PD – TRAINING
Topic: CTS
Author: Nilesh Ingale &
P. Ravikumar
Date: 08-11-2012
Confidential Information: Do not share or
photocopy without prior written approval 1
INTRODUCTION and SCOPE
After completing this unit, you should be able to:
1.) List the status of the design prior to CTS
2.) Set up the design for clock tree synthesis
3.) Identify implicit clock tree start/end points and when
explicit modifications are needed
4.) Control the constraints and targets used by CTS
5.) Execute the recommended clock tree synthesis and
optimization flow
6.) Analyze timing and clock specifications post CTS
Why Clocks?
Clocks provide the means to synchronize
By allowing events to happen at known timing boundaries, we can
sequence these events
Greatly simplifies building of state machines
No need to worry about variable delay through
combinational logic (CL)
All signals delayed until clock edge (clock imposes the worst case
delay)
Prior to Clock Tree Synthesis (pre-CTS)
Clock buffer tree is typically not built yet
Clock input ports are connected directly to all FF clock pins
What does CTS do?
Inserts buffers to meet skew, latency and transition goals
Goals are set by the user, the vendor and/or PnR tool.
What does Timing Analysis do Pre-CTS?
Timing analysis assumes ‘ideal’ clock networks by default ->
zero skew, latency and transition
Ignores buffers even if they are
present
This would produce overly
optimistic timing results
Modeling Clock Tree Effects Pre-CTS
Your SDC constraints should contain
the following set of commands for
each clock domain:
set_clock_uncertainty
set_clock_latency
set_clock_transition
The clocks are still considered
to be ‘ideal’, but the zero values
are overridden by the values
specified by these SDC commands
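A minimal sketch of how these values enter a pre-CTS setup check, assuming one register-to-register path on a single ideal clock. The function name and all numbers (in ps) are hypothetical; set_clock_transition is left out because its effect goes through the library cell delay tables rather than simple arithmetic.

```python
def setup_slack_ps(period, latency, uncertainty, data_delay, setup):
    """Setup slack for a same-clock path under the ideal-clock model (values in ps)."""
    launch_edge = latency              # ideal clock: same latency at every register
    capture_edge = period + latency    # next edge, same modeled latency
    required = capture_edge - uncertainty - setup
    arrival = launch_edge + data_delay
    return required - arrival


# Hypothetical path: 2000 ps period, 500 ps modeled latency,
# 1700 ps data path, 100 ps setup time.
no_margin = setup_slack_ps(2000, 500, 0, 1700, 100)
with_margin = setup_slack_ps(2000, 500, 150, 1700, 100)
print(no_margin, with_margin)  # 200 50
```

Because the clock is ideal, the same latency is applied at launch and capture and cancels out; only the uncertainty reduces the slack (200 ps to 50 ps here).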
Design Status, Start of CTS Phase
Placement – completed
Power and ground nets – prerouted
Estimated congestion – acceptable
Estimated timing – acceptable (~0ns slack)
Estimated max cap/transition – no violations
High fanout nets:
Reset, Scan Enable synthesized with buffers
Clocks are still not buffered
Is the Design Ready for CTS?
check_physical_design -for_cts checks that:
the design is placed
clocks have been defined
clock roots are not hierarchical pins
check_clock_tree checks and warns if:
a clock source pin is a hierarchical pin
a generated-clock with improperly specified master-clock
a clock tree has no synchronous pins
there are multiple clocks per register
Starting Point before CTS
Basic Terminology
PLL
Clock Period
Clock Latency
Source Latency
Network Latency
Clock Uncertainty
Setup & Hold
Constraints
Max Capacitance
Max Fanout
Max Transition
Skew
Global skew
Local skew
Useful skew
Clock Period
For synchronous designs, data transfer between functional
elements is synchronized by clock signals
Clock signals are generated externally (e.g., by PLL)
Clock period equation
clock period ≥ Td + Tskew + Tsu
Td : Longest path through combinational logic
Tskew : Clock skew
Tsu : Setup time of the synchronizing elements
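A small worked example of the relation clock period ≥ Td + Tskew + Tsu, written as a sketch. The function name and the delay values (in ns) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def min_clock_period_ns(t_d, t_skew, t_su):
    """Smallest clock period that satisfies period >= Td + Tskew + Tsu."""
    return t_d + t_skew + t_su


# Hypothetical values: 3.0 ns of combinational delay, 0.5 ns skew, 0.5 ns setup.
period_ns = min_clock_period_ns(t_d=3.0, t_skew=0.5, t_su=0.5)
f_max_mhz = 1000.0 / period_ns
print(period_ns, f_max_mhz)  # 4.0 250.0
```

With these numbers the skew term alone accounts for 12.5% of the cycle, which is why reducing skew directly raises the achievable frequency.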
Clock Skew
Clock skew is the maximum difference in the arrival time of a
clock signal at two different components.
Clock skew forces designers to use a large time period between
clock pulses. This makes the system slower.
So, in addition to other objectives, clock skew should be
minimized during clock routing.
Clock Skew Causes
Designed (unavoidable) variations – mismatch in buffer load
sizes, interconnect lengths
Process variation – process spread across die yielding different
Leff (Effective Channel length), Tox (oxide thickness), etc.
values
Temperature gradients – changes MOSFET performance across
die
IR voltage drop in power supply – changes MOSFET
performance across die
Clock Latency
Clock source latency is defined as the delay from Clock source
to clock definition port in your design
Clock network latency is defined as the delay from the Clock
definition port to clock sink of your design
– It is also known as insertion delay (standard term)
Zero Skew Methodologies
Global Skew
– Achieve zero skew between 2 synchronous pins, without
considering logic relationships
Skew global = LP – SP (LP, SP: longest and shortest clock path delay)
Skew global = 3 - 2 = 1
• Local Skew
– Achieve zero skew between 2 synchronous pins, while
considering logic relationships
Skew local = LP – SP
Skew local = 2.5 - 2 = 0.5
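The difference between the two numbers can be reproduced with a short sketch. The per-flop latencies and the assumption that only FF1 and FF2 exchange data are hypothetical; they were picked so the results match the 1 and 0.5 shown above.

```python
# Hypothetical clock latencies at three flip-flops.
latency = {"FF1": 2.0, "FF2": 2.5, "FF3": 3.0}

# Hypothetical logic relationships: only FF1 and FF2 are connected by a data path.
related_pairs = [("FF1", "FF2")]


def global_skew(latency):
    """Longest minus shortest clock path, ignoring logic relationships."""
    return max(latency.values()) - min(latency.values())


def local_skew(latency, related_pairs):
    """Worst latency difference among flops that actually exchange data."""
    return max(abs(latency[a] - latency[b]) for a, b in related_pairs)


print(global_skew(latency))                # 1.0
print(local_skew(latency, related_pairs))  # 0.5
```

FF3 sets the global skew but does not affect the local skew, because no timing path connects it to the other two flops in this example.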
Useful Skew
• Skewing a clock to improve timing
• Determines clock insertion values based on logic
path delay
• Evenly distributes slack to adjacent paths
Useful Skew = D source – D target
Setup Slack = T – skew – data path
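A minimal sketch of the two formulas above, assuming a hypothetical three-flop pipeline FF1 -> FF2 -> FF3 with data path delays of 9 and 5 and a period of 8. Setup time and hold checks are ignored to keep the arithmetic visible; all names and numbers are illustrative only.

```python
def setup_slack(period, d_source, d_target, data_path):
    """Setup Slack = T - skew - data path, with skew = D source - D target."""
    skew = d_source - d_target
    return period - skew - data_path


T = 8

# Zero skew: every flop sees the same clock insertion delay.
zero_skew = (setup_slack(T, 0, 0, 9), setup_slack(T, 0, 0, 5))

# Useful skew: delay the clock at FF2 by 2, so the long path gains time
# and the short path gives some up.
useful_skew = (setup_slack(T, 0, 2, 9), setup_slack(T, 2, 0, 5))

print(zero_skew)    # (-1, 3)
print(useful_skew)  # (1, 1)
```

The total slack is unchanged (2 in both cases); delaying FF2 only redistributes it so that neither path violates.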
Clock Design Problem
What are the main concerns for clock design?
Skew
No. 1 concern for clock networks
For increased clock frequency, skew may contribute over
10% of the system cycle time
Power
very important, as clock is a major power consumer!
It switches at every clock cycle!
Clock Consumes ~ 40% of the power.
Noise
Clock is often a very strong aggressor
May need shielding
Delay
Not really important
But slew rate is important (sharp transition)
Clock Design Considerations
Clock signal is global in nature, so clock nets are usually
very long.
Significant interconnect capacitance and resistance
So what are the techniques?
Routing
Clock tree versus clock mesh (grid)
Balance skew and total wire length
Buffer insertion
Clock buffers to reduce clock skew, delay, and distortion
in waveform.
Wire sizing
To further tune the clock tree/mesh
Clock Jitter
Variations in clock arrival time at inputs of a sequencing
element
Random and Deterministic components
Varies cycle to cycle
Contrast with Clock skew: measures the average difference in arrival times
of the clock at two different sequencing elements.
Period jitter: variations in period when referenced to an ideal
clock.
Cycle-to-cycle jitter: variations in the next edge when referenced
to the previous edge
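Both jitter measures can be computed from a list of rising-edge timestamps. The sketch below assumes peak values are of interest and reads cycle-to-cycle jitter as the change between consecutive periods, which is a common interpretation; the edge times (in ps) and function names are hypothetical.

```python
def periods(edges):
    """Measured period between each pair of consecutive edges."""
    return [b - a for a, b in zip(edges, edges[1:])]


def period_jitter(edges, ideal_period):
    """Peak deviation of any measured period from the ideal period."""
    return max(abs(p - ideal_period) for p in periods(edges))


def cycle_to_cycle_jitter(edges):
    """Peak change in period between two consecutive cycles."""
    p = periods(edges)
    return max(abs(b - a) for a, b in zip(p, p[1:]))


# Hypothetical rising-edge times for a nominal 1000 ps clock.
edges_ps = [0, 1000, 2010, 2990, 4000]

print(periods(edges_ps))                # [1000, 1010, 980, 1010]
print(period_jitter(edges_ps, 1000))    # 20
print(cycle_to_cycle_jitter(edges_ps))  # 30
```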
Clock Jitter: Flip flops
What causes clock jitter?
Clock Distribution Network
General goal of clock distribution
Deliver clock to all memory elements with acceptable skew
Deliver clock edges with acceptable sharpness
Clocking network design is one of the greatest challenges in the
design of a large chip
Consume up to 1/3 of chip power
Accurate signal delay
Signal integrity
Subject to uncertainty / variation of different processes /operating
conditions
Clock design Components
Oscillator
Dividers
Buffers
Strong drivers
Reduce delay
Signal integrity / slew rate
Interconnects
Balanced trees, meshes, etc.
Shielding (e.g., for crosstalk reduction)
Non-tree links / feedback loops
Clock Distribution Objective
Minimum / bounded skew
performance / hold time requirements
Guaranteed slew rate / signal integrity
Small insertion delay
Robustness under process / operating condition variation
Minimum cell / routing area
Minimum power consumption
Clock Distribution Robustness Subject
to
Radically different loading (flip-flop density)
Across the die
ECO (Engineering Change Order)
Interconnect coupling
Signal integrity
Delay variation
Process variation
From lot-to-lot
Across the die
Buffers
Metal width
Supply voltage variation across the die
Both static IR drop
Dynamic voltage drop
Temperature
Issues in Clock Distribution Network
Design
Skew
Process, voltage, and temperature
Data dependence
Noise coupling
Load balancing
Power, CV²f (consumes up to 1/3 of total chip power)
Clock gating
Flexibility/Tunability
Compactness – fit into existing layout/design
Facilitate ECO
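The CV²f power term and the effect of clock gating can be sketched as below, assuming the ungated clock network toggles every cycle. Units are chosen so that nF x V² x GHz gives watts; the capacitance, voltage and frequency values are hypothetical.

```python
def clock_power_w(c_nf, v, f_ghz):
    """Dynamic clock power P = C * V^2 * f (nF * V^2 * GHz = W)."""
    return c_nf * v * v * f_ghz


# Hypothetical clock network: 2 nF switched capacitance, 1.0 V, 0.5 GHz.
ungated = clock_power_w(c_nf=2.0, v=1.0, f_ghz=0.5)

# Same network with half of the capacitance sitting behind disabled clock gates.
half_gated = clock_power_w(c_nf=1.0, v=1.0, f_ghz=0.5)

print(ungated, half_gated)  # 1.0 0.5
```

Gating removes switched capacitance from the equation, which is why it appears next to power in the list above.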
Clock Tree Synthesis
CTS Goals
Meet the clock tree Design Rule Constraints (DRC):
Maximum transition delay
Maximum load capacitance
Maximum fanout
Maximum buffer levels
Meet the clock tree targets:
Maximum skew
Min/Max insertion delay
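As an illustration of how these goals act as pass/fail limits, the sketch below compares a set of measured clock tree numbers against DRC limits and targets. The limit values, metric names and measurements are all hypothetical; in practice they come from the library, the CTS settings and the tool reports.

```python
# Hypothetical upper limits (DRCs and targets).
MAX_LIMITS = {
    "transition_ns": 0.15,
    "capacitance_pf": 0.20,
    "fanout": 32,
    "buffer_levels": 12,
    "skew_ns": 0.10,
    "insertion_delay_ns": 1.50,
}
# Hypothetical lower limit (minimum insertion delay target).
MIN_LIMITS = {"insertion_delay_ns": 0.80}


def check_goals(measured):
    """Return a list of violated goals, in the order the limits are declared."""
    found = [f"{k} > {lim}" for k, lim in MAX_LIMITS.items() if measured[k] > lim]
    found += [f"{k} < {lim}" for k, lim in MIN_LIMITS.items() if measured[k] < lim]
    return found


measured = {
    "transition_ns": 0.12,
    "capacitance_pf": 0.18,
    "fanout": 40,
    "buffer_levels": 9,
    "skew_ns": 0.13,
    "insertion_delay_ns": 1.20,
}
print(check_goals(measured))  # ['fanout > 32', 'skew_ns > 0.1']
```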
Clock Tree Synthesis (CTS) (1/2)
Clock Tree Synthesis (CTS) (2/2)
Where does the Clock Tree Begin and
End?
Define Clock Root Attributes (1/2)
When the clock root is a primary port of a block
Ensure that an appropriate driving cell is defined
set_driving_cell
The synthesis constraints may include a weak driving cell for
all inputs, including the clock port
Because the clock is ideal during synthesis it has no effect
on design QoR
But a weak driver on the clock port affects clock tree QoR
during CTS
Define Clock Root Attributes (2/2)
When the clock root is a primary port, but at the CHIP level
through an IO-PAD
Ensure that an appropriate input transition is defined
set_input_transition
Stop, Float and Exclude Pins
Leaf Pins
leaf_pins: Define specific pins as leaves, i.e. stop tracing
the clock when encountered
Generated and Gated Clocks
Skew Balancing not Required?
User-defined or Explicit Stop Pins
Defining an Explicit Stop Pin
Defining an Explicit Float Pin
Preserving Pre-Existing Clock Trees
Impact of Preexisting Clock Cells
Any preexisting clock buffers and cells are counted
as clock gate levels
Any clock gate level is considered as a balancing point,
therefore…
Preexisting clock buffers/inverters might create
unnecessary clock levels for CTS
Use remove_clock_tree to remove existing clock buffers
Will generally lead to higher quality clock trees
Specifying Skew / Insertion Delay
Targets
Refer to CTS script.
Set Buffer/Inverter Selection Lists
To limit CTS to a list of buffers/inverters used for
specific optimizations:
Command:
There is no priority on how CTS uses the members from
each list
If a list is not specified, all buffers/inverters in the library
without dont_use attributes are used
Make sure the references are in target_library
When Clock Tree DRCs are Used
Non-Default Clock Routing
PnR tool can route the clocks using non-default routing rules,
e.g. double-spacing, double-width, shielding
Non-default rules are often used to “harden” the clock, e.g. to
make the clock routes less sensitive to Cross Talk or EM effects
NDR Recommendations
Always route clock on metal 3 and above
Avoid NDR on clock sinks:
set_clock_tree_options -use_default_routing_for_sinks 1
Avoid NDR on Metal 1
may have trouble accessing metal 1 pins on buffers and
gates
Put NDR on pitch – try to avoid blind double spacing
Preserve routing resources/keep preroute RC estimation
accurate
Consider double width to reduce resistance
Consider double via to reduce resistance and improve yield
Effects of Clock Tree Synthesis
Clock buffers added
Congestion may increase
Non clock cells may have been
moved to less ideal locations
Can introduce new timing
and max tran/cap violations
Post CTS / Optimization
clock_opt -only_psyn
Reduces disturbances to other cells as much as possible
Performs logical and placement optimizations to fix possible
timing and max tran/cap violations, based on propagated
clock arrivals
To enable hold time fixing
To prioritize TNS over WNS, set:
To prioritize min over max, set:
Minimize Hold Time Violations in Scan
Paths Reordering
Reorders to minimize crossings between clock
buffers
Can reduce unnecessary hold time violations in the
scan chain
Recommended Flow
All CTS-built clocks are propagated automatically – no
need to use the “set_propagated_clock” command!
Analysis using the CTS GUI
CTS browser
Properties and attributes on clock tree objects
Traversing clock tree levels
Symbols for CTS objects like buffers, gates and sinks
CTS schematic
Trace forward/backward in schematic view
Collapses all sinks in the fanout of a CTS buffer for clearer
CTS schematic
Highlight CTS objects in the layout view
Clock arrival histogram
Analyzing CTS Results
report_clock_tree
-summary
-settings
-...
Reports Max global skew, Late/Early insertion delay, Number
of levels in clock tree, Number of clock tree references
(Buffers), Clock DRC violations
report_clock_timing
Reports actual, relevant skew, latency, interclock latency
etc. for paths that are related.
Example: report_clock_timing -type skew
What about CTS Operating Conditions?
What happens when building the CT using min_max?
The tree is compiled in -max then analyzed in -min
If the skew analyzed in -min is not worse than the skew
in -max, compiling with -min_max will not make much
difference
If the skew analyzed in -min is worse than that in -max,
then compiling with -min_max will build a tree with a better
skew in -min at the cost of a possibly worse skew in -max
In summary, compiling with -min_max builds a tree with
less skew variation when analyzed in both -min and -max
The skew will not be better than that of a tree compiled and
analyzed in -max
Clock Tree Optimization
Perform additional Clock Tree Optimization as
necessary to further improve clock skew.
Invoke CTS: Core Command
clock_opt use recommendation
Using clock_opt in the following manner has been
found to be more flexible across designs and flows:
clock_opt -only_cts -no_clock_route
analyze…
clock_opt -only_psyn -no_clock_route
analyze…
route_group -all_clock_nets
Clock Tree Optimization Techniques
• Buffer/Gate Sizing
• Buffer/Gate Relocation
• Level Adjustment
• Reconfiguration
• Delay Insertion
• Dummy Load Insertion
Gate/Buffer Sizing
• Sizes up or down buffers/gates to improve both
skew and insertion delay
• These are LEQ cells extracted by the tool
• Users can limit some buffers/gates in the LEQ lists
Gate/Buffer Relocation
• Physically moves cells to reduce skew and
insertion delay
• Calls Overlap/Removal engine
Level Adjustment
• Adjusting a pin to its upper or lower logic
equivalent net
Reconfiguration
• Re-clustering of sequential logic
• Buffer placement performed after re-clustering –
runtime intensive
• Recommended for small clock trees
Delay Insertion
• Works on low fan-out nets where no clock tree is
inserted
• Delay cells may be specified by users or extracted
by the tool
Dummy Load Insertion
• Load balance function
• Uses a cell's input capacitance to increase loading
• Dummy Load cells may be specified by users or
extracted by the tool
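The load balance idea can be sketched as a simple calculation: find the heaviest branch and work out how much extra capacitance each lighter branch needs to match it. The buffer names and capacitance values (in fF) are hypothetical, and a real tool would also account for wire RC and the discrete pin capacitances of the available dummy cells.

```python
def dummy_load_needed(branch_cap_ff):
    """Extra capacitance per branch so every branch matches the heaviest one."""
    target = max(branch_cap_ff.values())
    return {name: target - cap for name, cap in branch_cap_ff.items()}


# Hypothetical load seen by three same-level clock buffers.
branch_cap_ff = {"buf_a": 120, "buf_b": 95, "buf_c": 120}

print(dummy_load_needed(branch_cap_ff))
# {'buf_a': 0, 'buf_b': 25, 'buf_c': 0}
```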
(Embedded) Clock Tree Optimization
Balancing Multiple Synchronous Clocks
Inter-Clock Delay Balancing
Inter-Clock Delay Balancing with Offset
SDC Latencies
CTS does not respect SDC latencies by default!
If you need your insertion delays to match the SDC
provided latencies, perform clock tree balancing
Note: Insertion delay will not be minimized if given SDC
latency is less than initial CTS insertion delay
CTS – Checklist.
Prerequisite Check
Make sure the design is legally placed
Make sure all the clocks and clock constraints are defined
Is source of generated clock really a clock source (make sure
that there is a create_clock defined on the source net)?
Can create_generated_clock trace back along a real path to the
clock source? If not, the sinks of the generated clocks will not be
balanced with the sinks of the source.
Clock definitions on hierarchical ports are not supported
during clock tree synthesis; if any such definitions exist,
redefine the clock on the output pin of the driver of the
hierarchical port.
CTS Check List Contd…
Clock Exceptions Check
If you use set_clock_tree_exceptions to specify a particular pin
as a stop_pin, float_pin or exclude_pin, the last one takes
precedence.
Clock-related attributes and nondefault rules are propagated in
spite of dont_touch_subtrees being specified; use set
cts_traverse_dont_touch_subtrees false to override this feature.
CTS checklist Contd….
Timer-Related Check
Use report_disable_timing to make sure that the disabled timing
arcs are intentional.
Use report_case_analysis to make sure that the
set_case_analysis are intentional and make sense; use
remove_case_analysis to remove the incorrect ones.
The set_timing_derate command is ignored by clock tree
synthesis and report_clock_tree; use report_timing_derate to
check. The report_clock_timing and report_timing commands
honor set_timing_derate.
The message "Invalid phase delay at pin xx/yy" implies a
problem - open a STAR; this message is printed only in debug
mode (set cts_use_debug_mode true).
Clock Tree Synthesis Best Practices
1.) Big Insertion delays.
Are there delay cells in the design?
Check to see if there are delay cells in the netlist that are present
in the current design. These could be causing delay that cannot
be optimized and CTS is building clock trees which match all
other paths to this worst insertion delay.
Are there cells marked "don’t touch" in the design?
There could be cells in the design that are marked "don’t touch"
which prevents CTS from deleting them and building optimal
clock trees.
Can the floorplan be modified to be more clock friendly?
Sometimes it helps to consider CTS (and timing) as a constraint for
floorplanning. Long skinny channels leading to more long skinny
placement channels will give both timing optimization and CTS
problems. Consider using soft blockages or refloorplan.
CTS Best Practices Contd...
Can you define new create_clocks that will assist CTS (divide and
conquer)?
Many times running CTS on the main clock pin is not the optimal
way to build clock trees. It may help to divide the clock tree based
on the floorplan and the syncpins and build sub clocks, then
define the sync pins and build the upper main clock.
Are the syncPins defined correctly for macros?
It is a good idea to check the syncPins file to see if the sync pins
make sense. Also check that the numbers are accurate and that
the time units are correct.
If there are ignore pins in the design are they defined as ignore
pins?
If there are ignore pins in the design, make sure you define these as
ignore pins before running CTS.
CTS Best Practices. Contd...
Have you used varRouteRules and propagated them with
astMarkClockTree?
Defining varRouteRules helps to reduce the insertion delay. Define
shielding, and double or more width rules for clock nets, and
propagate them using astMarkClockTree.
Are the CTU buffers marked as "dont use"?
Some technologies use clock tree buffers. Make sure you are
using these only for your clock tree. Also make sure they are not
marked "dont use".
Be creative and use different CTS intParams to get better
results.
There are several CTS options in the form that you can try to
change to get better or more desirable CTS results.
CTS Best Practices Contd...
Use the Block option in CTS in the first attempt.
This usually gives better insertion and skew results. If your design
is less than 5% std cell utilization try the Top option.
CTO is designed to work on skew and will not reduce insertion
delay once the clock tree is built.
Try providing a higher skew goal during CTS.
Use inverters only to build the clock tree if possible.
Define variable route rules with greater than default widths and
clearance and also shield the clock nets.
Then propagate these rules using astMarkClockTree. This will
help insertion delay.
CTS Best Practices for Unreasonable
skew.
Do you have derived clocks that do not need skew matching?
If you have clocks that get divided and some branches do not need
skew balancing with the rest, then build clock trees for them
separately and do not allow skew calculation between them.
You can define sync pins or ignore pins at cross-over points.
Look closely at your worst path(s) for possible culprits.
It is quite likely that some of your worst paths have an issue which
is preventing CTS from optimizing them and is causing all other
paths to get delay added to match the insertion delay or better
skew.
CTS doesn't run properly?
Are the SDC constraints loaded and is create_clock defined?
If there are no create_clock statements in the SDC file loaded,
CTS will not run. Make sure you have at least one create_clock
in your SDC file. It is good practice to have set_clock_transition,
set_clock_latency, and set_clock_uncertainty also defined. For
the SDC latency values to be honored, the intParam
axSetIntParam "acts" "clock uncertainty goal" 1 should be set.
CTS uses constraints in the CTS form as first priority, then it
uses the constraints in the intParams, and then it uses SDC
constraints. Having these in the SDC file will also enable the
timer to account for your skew and insertion delay in
optimization steps.
Build the clock tree on lower clocks, then define the sync pins
and run CTS on next level up (divide and conquer).
This is a good practice when building clock trees. Always
remember to define sync pins if you need them.
astSetDontTouch "clock_buffers.list" #f done?
If your CTS buffers have a "dont use" property in your library, you
need to set that to false.
Are the clock nets marked "dont touch" or is set_case_analysis
defined?
Occasionally you may end up with a "dont touch" property on your
clock net as a result of your analysis. Make sure you reset this
using the astMarkClockTree command. Also if your SDC
constraints have a set_case_analysis defined that disables the
clock net, CTS will not build clock trees.
Is create_clock defined on a non-physical hierarchical pin?
If you define create_clock on a pin that is not present physically
and is only present in the hierarchical netlist, CTS will not be
able to run.
Try different CTS options and use the one that gives the best
results.
As always, it is a good idea to experiment and try out different CTS
options and intParams to get the best result.
Clock Distribution Structures
Grids
Gridded clock distribution common on earlier DEC Alpha
microprocessors
Advantages:
Skew determined by grid density,
not too sensitive to load position
Clock signals available everywhere
Tolerant to process variations
Usually yields extremely low skew values
Disadvantages:
Huge amount of wiring and power
To minimize such penalties, the grid pitch must be made
coarser, which loses the grid advantage
H-Tree
H-tree
One large central driver, recursive structure to match
wirelengths
Halve wire width at branching points to reduce reflections
Disadvantages
Slew degradation along long RC paths
Unrealistically large central driver
- Clock drivers can create large temperature
gradients.
Non-uniform load distribution
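The recursive, wirelength-matched structure can be sketched as below. The function generates sink locations for an idealized, unbuffered H-tree and records the wire length from the root to each sink. The starting segment length, the number of levels and the function name are hypothetical.

```python
def h_tree_sinks(x, y, half_len, levels, horizontal=True, dist=0):
    """Return [(x, y, wire_length_from_root)] for every sink of an ideal H-tree."""
    if levels == 0:
        return [(x, y, dist)]
    sinks = []
    for sign in (-1, 1):
        # Branch left/right on horizontal levels, up/down on vertical levels.
        nx = x + sign * half_len if horizontal else x
        ny = y if horizontal else y + sign * half_len
        # Segment length halves after each completed "H" (every two levels).
        next_len = half_len if horizontal else half_len / 2
        sinks += h_tree_sinks(nx, ny, next_len, levels - 1,
                              not horizontal, dist + half_len)
    return sinks


sinks = h_tree_sinks(0, 0, half_len=8, levels=4)
lengths = {d for _, _, d in sinks}
print(len(sinks), lengths)  # 16 {24.0}
```

Every sink sees the same root-to-sink wire length by construction, so skew in a real H-tree comes from load mismatch and variation rather than from routing length.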
Buffered H-tree
Advantages
Ideally zero-skew
Can be low power (depending on skew requirements)
Low area (silicon and wiring)
CAD tool friendly (regular)
Disadvantages
Sensitive to process variations
Devices: want same-size buffers at each level of tree
Wires: want similar segment lengths on each layer in each source-sink path
Local clocking loads inherently non-uniform
Clock tree Mesh
Clock meshes are homogeneous shorted grids of metal that are
driven by many clock drivers. The purpose of a clock mesh is to
reduce clock skew in both nominal designs and designs across
variations such as on-chip variation (OCV), chip-to-chip
variation, and local power fluctuations.
Benefits of Meshes
Deterministic since shielded all the way down to rib distribution
No ECO placement required: all buffers preplaced before block
placement
Low latency since uses shorted (= ganged, parallel) drivers,
therefore lower skew
ECO placements of FFs later do not require rebalancing of tree
“Idealized” clocking environment for “concurrent dance” of RTL
design and timing convergence
Problems with Meshes
Burn more power at low frequencies
Blocks more routing resources (solution: integrated power
distribution with ribs can provide shielding for ‘free’)
Difficult for ‘sparse’ clock domains that will not tolerate regioning
Post placement (and routing) tuning required
No ‘beneficial skew’ possible
Clock gating only easy at root
Fighting tools to do analysis:
Clumped buffers a problem in Static Timing Analysis tools
Large shorted meshes a problem for STA tools
What does Elmore delay calculation look like for a non-tree?
Need full extraction and SPICE-like simulation to determine skew
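For reference, the Elmore delay of an ordinary RC tree can be sketched as below. For every wire segment on the unique root-to-sink path, it adds the segment resistance times all capacitance downstream of that segment. That unique path is exactly what a shorted mesh lacks. The node names and RC values are hypothetical (kΩ and fF, so delays come out in ps).

```python
def elmore_delays(parent, r, c):
    """Elmore delay from the root to every node of an RC tree.

    parent: node -> parent node (the root maps to None)
    r:      node -> resistance of the wire from its parent into the node
    c:      node -> capacitance attached at the node
    """
    children = {n: [] for n in parent}
    for n, p in parent.items():
        if p is not None:
            children[p].append(n)

    def downstream_cap(n):
        # Capacitance at this node plus everything below it.
        return c[n] + sum(downstream_cap(k) for k in children[n])

    def delay(n):
        if parent[n] is None:
            return 0
        return delay(parent[n]) + r[n] * downstream_cap(n)

    return {n: delay(n) for n in parent}


# Hypothetical tree: root -> a, then a -> s1 and a -> s2.
parent = {"root": None, "a": "root", "s1": "a", "s2": "a"}
r = {"a": 2, "s1": 1, "s2": 3}             # kOhm
c = {"root": 0, "a": 1, "s1": 3, "s2": 1}  # fF

delays = elmore_delays(parent, r, c)
print(delays)  # {'root': 0, 'a': 10, 's1': 13, 's2': 13}
```

The two sinks have different wires and loads but the same delay, which is the balancing a tree-based CTS engine aims for.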
Hybrid Structure
Balanced tree on the top
Mesh in the middle
Minimize skew
Steiner minimum tree at the bottom
Minimize cost
Facilitate ECO
ASSIGNMENTS
1.) Report QOR before starting CTS / After CTS.
-Congestion Number.
- Setup / hold, TNS.
- Area.
- Number of Flops.
2.) Derive clock tree target constraints for leon block.
3.) Build the clock tree with minimum insertion delay of 40%
Show the relation between insertion delay and skew with
values?
4.) Report Clock tree transition with different transition settings 10%,
5%, 4%.
5.) Optimize clock tree with different CTS optimization techniques.
Use two of them. Report the QOR.