21EC71 Advanced VLSI Notes Module 2

Uploaded by

narayanmalagi9
21EC71 Advanced VLSI

Advanced VLSI
(21EC71)
SEMESTER – VII

"Education is not solely about earning a great living. It means living a great life."
- Brad Henry

Module 2
Floor planning and placement: Goals and objectives, Measurement of delay in Floor
planning, Floor planning tools, Channel definition, I/O and Power planning and Clock
planning. Placement: Goals and Objectives, Min-cut Placement algorithm, Iterative
Placement Improvement, Time driven placement methods, Physical Design Flow.

Routing: Global Routing: Goals and objectives, Global Routing Methods, Global routing
between blocks, Back annotation.

Textbook 1

Vijaykumar Sajjanar

vjkr.github.io

www.bldeacet.ac.in Page | 1

CONTENTS
Floorplanning
    Floorplanning Goals and Objectives
    Measurement of Delay in Floorplanning
    Floorplanning Tools
    Channel Definition
    I/O and Power Planning
    Clock Planning
Placement
    Placement Terms and Definitions
    Placement Goals and Objectives
    Placement Algorithms
        Min-Cut Placement Method
        Iterative Placement Improvement
    Timing-Driven Placement Methods
Physical Design Flow
Routing
Global Routing
    Goals and Objectives
    Global Routing Methods
    Global Routing Between Blocks
    Back-Annotation


FLOORPLANNING
Floorplanning is a mapping between the logical description (the netlist) and the
physical description (the floorplan).

The input to a floorplanning tool is a hierarchical netlist that describes the
interconnection of the blocks (RAM, ROM, ALU, cache controller, and so on); the logic cells
(NAND, NOR, D flip-flop, and so on) within the blocks; and the logic cell connectors (the
terms terminals, pins, or ports mean the same thing as connectors).

The netlist is a logical description of the ASIC; the floorplan is a physical
description of an ASIC.

FIGURE 16.3 Interconnect and gate delays. As feature sizes decrease, both average
interconnect delay and average gate delay decrease but at different rates. This is because
interconnect capacitance tends to a limit that is independent of scaling. Interconnect delay
now dominates gate delay.

FLOORPLANNING GOALS AND OBJECTIVES


The goals of floorplanning are to:

● arrange the blocks on a chip,

● decide the location of the I/O pads,

● decide the location and number of the power pads,

● decide the type of power distribution, and

● decide the location and type of clock distribution.

The objectives of floorplanning are to minimize the chip area and minimize delay.


MEASUREMENT OF DELAY IN FLOORPLANNING


● In floorplanning we wish to predict the interconnect delay before we complete any
routing.
● To predict delay we need to know the parasitics associated with interconnect: the
interconnect capacitance (wiring capacitance or routing capacitance) as well as the
interconnect resistance.
● We cannot predict the resistance of the various pieces of the interconnect path since we
do not yet know the shape of the interconnect for a net.
● We estimate interconnect length by collecting statistics from previously routed chips and
analyzing the results. From these statistics we create tables that predict the interconnect
capacitance as a function of net fanout and block size.
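
A wire-load table of this kind can be sketched in a few lines of Python. This is only an illustration: the capacitance values, the 4 kΩ driver resistance, and the function names are hypothetical, and a real table would be built from the statistics of previously routed chips for one particular block size.

```python
# Hypothetical wire-load table for one block size: fanout -> predicted
# interconnect capacitance in pF (illustrative values, not from a real library).
WIRE_LOAD_PF = {1: 0.03, 2: 0.05, 4: 0.09, 8: 0.17}

def predicted_capacitance(fanout):
    """Look up the net capacitance, interpolating linearly between table entries."""
    points = sorted(WIRE_LOAD_PF.items())
    if fanout <= points[0][0]:
        return points[0][1]
    for (f0, c0), (f1, c1) in zip(points, points[1:]):
        if fanout <= f1:
            return c0 + (c1 - c0) * (fanout - f0) / (f1 - f0)
    # Beyond the last entry, extrapolate using the slope of the last segment.
    (f0, c0), (f1, c1) = points[-2], points[-1]
    return c1 + (c1 - c0) * (fanout - f1) / (f1 - f0)

def predicted_delay_ns(fanout, driver_resistance_kohm=4.0):
    """Lumped estimate: delay = R_driver * C_net (kOhm * pF = ns)."""
    return driver_resistance_kohm * predicted_capacitance(fanout)
```

Because the table holds a single (average) value per fanout, the delay it predicts for any one net can be considerably in error, as Figure 16.4 (c) notes.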

FIGURE 16.4 Predicted capacitance. (a) Interconnect lengths as a function of fanout (FO)
and circuit-block size. (b) Wire-load table. There is only one capacitance value for each
fanout (typically the average value). (c) The wire-load table predicts the capacitance and
delay of a net (with a considerable error).


FLOORPLANNING TOOLS
● Figure 16.6 (a) shows an initial random floorplan generated by a floorplanning tool. Two
of the blocks, A and C in this example, are standard-cell areas (the chip shown in Figure
16.1 is one large standard-cell area). These are flexible blocks (or variable blocks)
because, although their total area is fixed, their shape (aspect ratio) and connector
locations may be adjusted during the placement step.
● We may force logic cells to be in selected flexible blocks by seeding. Seeding may be
hard or soft. A hard seed is fixed and not allowed to move during the remaining
floorplanning and placement steps. A soft seed is an initial suggestion only and can be
altered if necessary by the floorplanner.

FIGURE 16.6 Floorplanning a cell-based ASIC. (a) Initial floorplan generated by the
floorplanning tool. Two of the blocks are flexible (A and C) and contain rows of
standard cells (unplaced). A pop-up window shows the status of block A. (b) An
estimated placement for flexible blocks A and C. The connector positions are known
and a rat's nest display shows the heavy congestion below block B. (c) Moving blocks
to improve the floorplan. (d) The updated display shows the reduced congestion after
the changes.

● We need to control the aspect ratio of our floorplan because we have to fit our chip into
the die cavity (a fixed-size hole, usually square) inside a package.
● With practice, we can create a good initial placement by floorplanning and a pictorial
display.


FIGURE 16.7 Congestion analysis. (a) The initial floorplan with a 2:1.5 die aspect ratio.
(b) Altering the floorplan to give a 1:1 chip aspect ratio. (c) A trial floorplan with a
congestion map. Blocks A and C have been placed so that we know the terminal positions
in the channels. Shading indicates the ratio of channel density to the channel capacity.
Dark areas show regions that cannot be routed because the channel congestion exceeds the
estimated capacity. (d) Resizing flexible blocks A and C alleviates congestion.

CHANNEL DEFINITION
During the floorplanning step we assign the areas between blocks that are to be used for
interconnect. This process is known as channel definition or channel allocation.
Figure 16.8 shows a T-shaped junction between two rectangular channels and illustrates
why we must route the stem (vertical) of the T before the bar. The general problem of
choosing the order of rectangular channels to route is channel ordering.

FIGURE 16.8 Routing a T-junction between two channels in two-level metal. The dots
represent logic cell pins. (a) Routing channel A (the stem of the T) first allows us to adjust
the width of channel B. (b) If we route channel B first (the top of the T), this fixes the
width of channel A. We have to route the stem of a T-junction before we route the top.


Figure 16.9 shows a floorplan of a chip containing several blocks. Suppose we cut along
the block boundaries, slicing the chip into two pieces (Figure 16.9 a). Then suppose we can
slice each of these pieces into two. If we can continue in this fashion until all the blocks are
separated, then we have a slicing floorplan (Figure 16.9 b). Figure 16.9 (c) shows how the
sequence we use to slice the chip defines a hierarchy of the blocks. Reversing the slicing
order ensures that we route the stems of all the channel T-junctions first.
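
The slicing idea can be sketched in Python as follows. This is a simplified illustration, not a floorplanning tool: blocks are hypothetical rectangles given as (name, x0, y0, x1, y1), a cut line stands in for the channel that would lie along it, and the function names are our own. The sketch slices recursively, records the cuts in the order they are made, and reverses that order to obtain a channel routing order. If no cut can cross a piece without chopping a block, the floorplan is not a slicing structure.

```python
def find_cut(blocks):
    """Return ('v', x) or ('h', y) for a cut that crosses no block, else None.

    Each block is (name, x0, y0, x1, y1) with x0 < x1 and y0 < y1.
    """
    for axis, lo, hi in (("v", 1, 3), ("h", 2, 4)):
        far_edge = max(b[hi] for b in blocks)
        for c in sorted({b[hi] for b in blocks} - {far_edge}):
            # A legal cut leaves every block entirely on one side.
            if all(b[hi] <= c or b[lo] >= c for b in blocks):
                return axis, c
    return None

def slicing_cuts(blocks, cuts=None):
    """Slice recursively; return the cuts in the order they were made."""
    cuts = [] if cuts is None else cuts
    if len(blocks) > 1:
        cut = find_cut(blocks)
        if cut is None:
            raise ValueError("not a slicing floorplan")
        cuts.append(cut)
        axis, c = cut
        hi = 3 if axis == "v" else 4
        slicing_cuts([b for b in blocks if b[hi] <= c], cuts)
        slicing_cuts([b for b in blocks if b[hi] > c], cuts)
    return cuts

def channel_routing_order(blocks):
    """Route the channels in the reverse of the slicing order."""
    return list(reversed(slicing_cuts(blocks)))
```

For three blocks A = (0, 0, 2, 4), B = (2, 0, 4, 2), and C = (2, 2, 4, 4), the first cut is vertical at x = 2 and the second is horizontal at y = 2, so the channel between B and C is routed before the channel beside A.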

FIGURE 16.9 Defining the channel routing order for a slicing floorplan using a
slicing tree. (a) Make a cut all the way across the chip between circuit blocks. Continue
slicing until each piece contains just one circuit block. Each cut divides a piece into two
without cutting through a circuit block. (b) A sequence of cuts: 1, 2, 3, and 4 that
successively slices the chip until only circuit blocks are left. (c) The slicing tree
corresponding to the sequence of cuts gives the order in which to route the channels: 4, 3,
2, and finally 1.
Figure 16.10 shows a floorplan that is not a slicing structure. We cannot cut the chip all
the way across with a knife without chopping a circuit block in two. This means we cannot
route any of the channels in this floorplan without routing all of the other channels first. We
say there is a cyclic constraint in this floorplan. There are two solutions to this problem.
One solution is to move the blocks until we obtain a slicing floorplan. The other solution is to
allow the use of L-shaped, rather than rectangular, channels (or areas with fixed
connectors on all sides, called switch boxes).

FIGURE 16.10 Cyclic constraints. (a) A nonslicing floorplan with a cyclic constraint
that prevents channel routing. (b) In this case it is difficult to find a slicing floorplan
without increasing the chip area. (c) This floorplan may be sliced (with initial cuts 1 or 2)
and has no cyclic constraints, but it is inefficient in area use and will be very difficult to
route.


I/O AND POWER PLANNING


Every chip communicates with the outside world. Signals flow onto and off the chip and
we need to supply power. We need to consider the I/O and power constraints early in the
floorplanning process.

FIGURE 16.12 Pad-limited and core-limited die. (a) A pad-limited die. The number of
pads determines the die size. (b) A core-limited die: The core logic determines the die size.
(c) Using both pad-limited pads and core-limited pads for a square die.

● Special power pads are used for the positive supply, or VDD, power buses (or power
rails) and the ground or negative supply, VSS or GND.
● Usually one set of VDD/VSS pads supplies one power ring that runs around the pad
ring and supplies power to the I/O pads only.
● Another set of VDD/VSS pads connects to a second power ring that supplies the logic
core.
● We sometimes call the I/O power dirty power since it has to supply large transient
currents to the output transistors. We keep dirty power separate to avoid injecting
noise into the internal-logic power (the clean power).
● I/O pads also contain special circuits to protect against electrostatic discharge
(ESD). These circuits can withstand very short high-voltage (several kilovolt) pulses
that can be generated during human or machine handling.
Figure 16.13 (a) and (b) are magnified views of the southeast corner of our example chip
and show the different types of I/O cells. Figure 16.13 (c) shows a stagger-bond
arrangement using two rows of I/O pads. In this case the design rules for bond wires (the
spacing and the angle at which the bond wires leave the pads) become very important.
Figure 16.13 (d) shows an area-bump bonding arrangement (also known as flip-chip,
solder-bump, or C4, terms coined by IBM, who developed this technology [Masleid, 1991])
used, for example, with ball-grid array (BGA) packages.


FIGURE 16.13 Bonding pads. (a) This chip uses both pad-limited and core-limited
pads. (b) A hybrid corner pad. (c) A chip with stagger-bonded pads. (d) An area-bump
bonded chip (or flip-chip). The chip is turned upside down and solder bumps connect the
pads to the lead frame.


CLOCK PLANNING
Figure 16.16 (a) shows a clock spine (not to be confused with a channel spine) routing
scheme with all clock pins driven directly from the clock driver. MGAs and FPGAs often use
this fish bone type of clock distribution scheme.

FIGURE 16.16 Clock distribution. (a) A clock spine for a gate array. (b) A clock spine
for a cell-based ASIC (typical chips have thousands of clock nets).
(c) A clock spine is usually driven from one or more clock-driver cells. Delay in the
driver cell is a function of the number of stages and the ratio of output to input
capacitance for each stage (taper). (d) Clock latency and clock skew. We would like to
minimize both latency and skew.
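
As a small numerical illustration of the two quantities in part (d), the sketch below computes latency and skew from a set of clock arrival times. The pin names and times are hypothetical, and taking latency as the worst-case arrival time is an assumption made here for simplicity.

```python
def clock_latency_and_skew(arrival_ns):
    """arrival_ns maps each clock pin to its clock arrival time after the source edge.

    Latency is taken here as the worst-case (largest) arrival time;
    skew is the spread between the latest and earliest arrivals.
    """
    latest = max(arrival_ns.values())
    earliest = min(arrival_ns.values())
    return latest, latest - earliest
```

With arrivals of 1.2 ns, 1.5 ns, and 1.3 ns at three flip-flops, the latency is 1.5 ns and the skew is 0.3 ns.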


PLACEMENT
After completing a floorplan we can begin placement of the logic cells within the
flexible blocks. Placement is much more suited to automation than floorplanning. Thus we
shall need measurement techniques and algorithms.

PLACEMENT TERMS AND DEFINITIONS


CBIC, MGA, and FPGA architectures all have rows of logic cells separated by the
interconnect; these are row-based ASICs. Figure 16.18 shows an example of the
interconnect structure for a CBIC. Interconnect runs in horizontal and vertical directions in
the channels and in the vertical direction by crossing through the logic cells. Figure 16.18
(c) illustrates the fact that it is possible to use over-the-cell routing (OTC routing) in areas
that are not blocked. However, OTC routing is complicated by the fact that the logic cells
themselves may contain metal on the routing layers.

FIGURE 16.18 Interconnect structure. (a) The two-level metal CBIC floorplan
shown in Figure 16.11 b. (b) A channel from the flexible block A. This channel has a
channel height equal to the maximum channel density of 7 (there is room for seven
interconnects to run horizontally in m1). (c) A channel that uses OTC (over-the-cell)
routing in m2.

With two layers of metal, we route within the rectangular channels using the first
metal layer for horizontal routing, parallel to the channel spine, and the second metal layer
for the vertical direction (if there is a third metal layer it will normally run in the horizontal
direction again). The maximum number of horizontal interconnects that can be placed side
by side, parallel to the channel spine, is the channel capacity.
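
Channel density, the number of horizontal interconnects a channel actually needs at its most crowded column, can be compared against the channel capacity with a short sweep. The sketch below is illustrative only; the span representation and function names are assumptions, and spans that merely touch at a column are counted as overlapping.

```python
def channel_density(spans):
    """Maximum number of horizontal interconnects crossing any column.

    spans: list of (left, right) column ranges, one per horizontal interconnect;
    spans that merely touch at an endpoint are counted as overlapping.
    """
    events = []
    for left, right in spans:
        events.append((left, 0))    # 0 sorts before 1: open before close at a tie
        events.append((right, 1))
    density = current = 0
    for _, kind in sorted(events):
        current += 1 if kind == 0 else -1
        density = max(density, current)
    return density

def fits(spans, capacity):
    """A channel can hold its interconnects only if density <= capacity."""
    return channel_density(spans) <= capacity
```

For the spans (0, 4), (2, 6), (5, 9), and (3, 5) the density is 3, so a channel with a capacity of 3 tracks is sufficient but one with 2 tracks is not.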


PLACEMENT GOALS AND OBJECTIVES


The goal of a placement tool is to arrange all the logic cells within the flexible blocks on
a chip. Ideally, the objectives of the placement step are to
● Guarantee the router can complete the routing step
● Minimize all the critical net delays
● Make the chip as dense as possible

We may also have the following additional objectives:


● Minimize power dissipation
● Minimize cross talk between signals

The most commonly used placement objectives are one or more of the following:
● Minimize the total estimated interconnect length
● Meet the timing requirements for critical nets
● Minimize the interconnect congestion

PLACEMENT ALGORITHMS
There are two classes of placement algorithms commonly used in commercial CAD
tools:

1. constructive placement
   a. variations on the min-cut algorithm
   b. eigenvalue method
2. iterative placement improvement

Placement usually starts with a constructed solution and then improves it using an
iterative algorithm.


MIN-CUT PLACEMENT METHOD

The min-cut placement method uses successive application of partitioning [Breuer, 1977].

The following steps are shown in Figure 16.24:
1. Cut the placement area into two pieces.
2. Swap the logic cells to minimize the cut cost.
3. Repeat the process from step 1, cutting smaller pieces until all the logic
cells are placed.
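
The three steps can be sketched in Python under some simplifying assumptions: the placement area is a single row, so the result is just a left-to-right order of cells; the cut cost is the number of nets crossing the cut; and a plain greedy pair swap stands in for a real partitioning algorithm such as Kernighan-Lin. All names are hypothetical.

```python
from itertools import product

def cut_cost(nets, left):
    """Count the nets that have cells on both sides of the cut."""
    left = set(left)
    return sum(1 for net in nets if 0 < len(net & left) < len(net))

def bisect(cells, nets):
    """Step 1: cut the piece in two. Step 2: swap cells while the cut cost drops."""
    half = len(cells) // 2
    left, right = list(cells[:half]), list(cells[half:])
    best = cut_cost(nets, left)
    improved = True
    while improved:
        improved = False
        for i, j in product(range(len(left)), range(len(right))):
            left[i], right[j] = right[j], left[i]          # trial swap
            cost = cut_cost(nets, left)
            if cost < best:
                best, improved = cost, True                # keep the swap
            else:
                left[i], right[j] = right[j], left[i]      # undo the swap
    return left, right

def min_cut_place(cells, nets):
    """Step 3: recurse on each piece; returns the cells in left-to-right order."""
    if len(cells) <= 1:
        return list(cells)
    piece = set(cells)
    # Keep only the part of each net that lies inside this piece.
    local = [set(net) & piece for net in nets]
    local = [net for net in local if len(net) > 1]
    left, right = bisect(cells, local)
    return min_cut_place(left, nets) + min_cut_place(right, nets)
```

For cells a, b, c, d with one net joining a and c and another joining b and d, the first cut initially crosses both nets; after swapping, each net lies wholly on one side and the connected cells end up next to each other.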

FIGURE 16.24 Min-cut placement. (a) Divide the chip into bins using a grid.
(b) Merge all connections to the center of each bin. (c) Make a cut and swap
logic cells between bins to minimize the cost of the cut. (d) Take the cut pieces and
throw out all the edges that are not inside the piece. (e) Repeat the process with a
new cut and continue until we reach the individual bins.


ITERATIVE PLACEMENT IMPROVEMENT


An iterative placement improvement algorithm takes an existing placement and tries
to improve it by moving the logic cells. There are two parts to the algorithm:
● The selection criteria that decide which logic cells to try moving.

● The measurement criteria that decide whether to move the selected cells.
There are several interchange or iterative exchange methods that differ in their
selection and measurement criteria:

● pairwise interchange,
● force-directed interchange,
● force-directed relaxation, and
● force-directed pairwise relaxation.
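
A minimal sketch of pairwise interchange follows. It assumes that every pair of cells is a candidate (the selection criterion) and that a swap is kept only if it reduces the total half-perimeter wire length (the measurement criterion). Both choices, and the function names, are illustrative assumptions rather than the method of any particular tool.

```python
from itertools import combinations

def wire_length(placement, nets):
    """Half-perimeter of the bounding box of each net, summed over all nets."""
    total = 0
    for net in nets:
        xs = [placement[cell][0] for cell in net]
        ys = [placement[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def pairwise_interchange(placement, nets):
    """Swap pairs of cells, keeping each swap only if the wire length drops."""
    placement = dict(placement)          # work on a copy
    best = wire_length(placement, nets)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(placement), 2):   # selection criterion
            placement[a], placement[b] = placement[b], placement[a]
            cost = wire_length(placement, nets)
            if cost < best:                                # measurement criterion
                best, improved = cost, True
            else:
                placement[a], placement[b] = placement[b], placement[a]
    return placement, best
```

Trying every pair is expensive for large designs, which is why practical methods limit the search to a neighborhood of the source cell.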

FIGURE 16.26 Interchange. (a) Swapping the source logic cell with a destination logic
cell in pairwise interchange. (b) Sometimes we have to swap more than two logic cells
at a time to reach an optimum placement, but this is expensive in computation time.
Limiting the search to neighborhoods reduces the search time.
Logic cells within a distance ε of a logic cell form an ε-neighborhood. (c) A
one-neighborhood. (d) A two-neighborhood.

FIGURE 16.27 Force-directed placement. (a) A network with nine logic cells.
(b) We make a grid (one logic cell per bin). (c) Forces are calculated as if springs were
attached to the centers of each logic cell for each connection. The two nets connecting
logic cells A and I correspond to two springs. (d) The forces are proportional to the
spring extensions.


Without external forces to counteract the pull of the springs between logic cells, the
network will collapse to a single point as it settles. An important part of force-directed
placement is fixing some of the logic cells in position. Normally ASIC designers use the I/O
pads or other external connections to act as anchor points or fixed seeds.
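
The spring model can be illustrated with a short calculation of the zero-force position of one logic cell, that is, the point where the pulls of all its springs cancel. The sketch assumes the force of each spring is proportional to its extension and to the number of nets it represents; the cell names and coordinates are hypothetical.

```python
def zero_force_position(cell, placement, connections):
    """Position where the spring forces on `cell` cancel.

    connections: list of (cell_a, cell_b, weight); the weight is the number of
    nets (springs) between the two cells. With force = weight * extension,
    the forces cancel at the weighted mean of the neighbours' positions.
    """
    sum_w = sum_x = sum_y = 0.0
    for a, b, w in connections:
        if cell in (a, b):
            other = b if a == cell else a
            x, y = placement[other]
            sum_w += w
            sum_x += w * x
            sum_y += w * y
    if sum_w == 0:
        return placement[cell]           # unconnected cell: no force acts on it
    return sum_x / sum_w, sum_y / sum_w
```

If cell I is tied to cell A at (0, 0) by two nets and to a fixed pad P at (3, 3) by one net, its zero-force position is (1, 1): twice as close to A as to P. The fixed pad is what keeps the network from collapsing to a point.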

FIGURE 16.28 Force-directed iterative placement improvement. (a) Force-directed
interchange. (b) Force-directed relaxation. (c) Force-directed pairwise relaxation.


TIMING-DRIVEN PLACEMENT METHODS


Minimizing delay is becoming more and more important as a placement objective.
There are two main approaches:

1. net based
2. path based

We know that we can use net weights in our algorithms. The problem is to calculate
the weights. One method finds the n most critical paths (using a timing-analysis engine,
possibly in the synthesis tool). The net weights might then be the number of times each net
appears in this list. The problem with this approach is that as soon as we fix (for example)
the first 100 critical nets, suddenly another 200 become critical.
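
The net-weighting scheme described above can be sketched as a simple count. The list of critical paths is assumed to come from a timing-analysis engine; here it is just a hypothetical list of net names per path.

```python
from collections import Counter

def net_weights(critical_paths):
    """Weight of a net = number of the listed critical paths that contain it."""
    weights = Counter()
    for path in critical_paths:
        weights.update(set(path))   # count each net once per path
    return weights
```

For the three paths [n1, n2, n3], [n2, n4], and [n2, n3], net n2 receives a weight of 3 and n3 a weight of 2, so the placement algorithm would try hardest to keep n2 short.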

FIGURE 16.29 The zero-slack algorithm. (a) The circuit with no net delays. (b) The
zero-slack algorithm adds net delays (at the outputs of each gate, equivalent to increasing
the gate delay) to reduce the slack times to zero.
With the zero-slack algorithm we simplify but overconstrain the problem. For
example, we might be able to do a better job by making some nets a little longer than the
slack indicates if we can tighten up other nets. What we would really like to do is deal with
paths such as the critical path shown in Figure 16.29 (a) and not just nets. Path-based
algorithms have been proposed to do this, but they are complex and not all commercial
tools have this capability.


PHYSICAL DESIGN FLOW


Historically placement was included with routing as a single tool (the term P&R is
often used for place and route). Because interconnect delay now dominates gate delay, the
trend is to include placement within a floorplanning tool and use a separate router.
1. Design entry. The input is a logical description with no physical information.
2. Synthesis. The initial synthesis contains little or no information on any interconnect
loading. The output of the synthesis tool (typically an EDIF netlist) is the input to the
floorplanner.
3. Initial floorplan. From the initial floorplan interblock capacitances are input to the
synthesis tool as load constraints and intrablock capacitances are input as wire-load tables.
4. Synthesis with load constraints. At this point the synthesis tool is able to resynthesize
the logic based on estimates of the interconnect capacitance each gate is driving. The
synthesis tool produces a forward annotation file to constrain path delays in the placement
step.

FIGURE 16.31 Timing-driven floorplanning and placement design flow.


5. Timing-driven placement. After placement using constraints from the synthesis tool,
the location of every logic cell on the chip is fixed and accurate estimates of interconnect
delay can be passed back to the synthesis tool.
6. Synthesis with in-place optimization ( IPO ). The synthesis tool changes the drive
strength of gates based on the accurate interconnect delay estimates from the floorplanner
without altering the netlist structure.
7. Detailed placement. The placement information is ready to be input to the routing
step.


ROUTING
Once the designer has floorplanned a chip and the logic cells within the flexible
blocks have been placed, it is time to make the connections by routing the chip.

Routing is usually split into

1. global routing
2. followed by detailed routing

GLOBAL ROUTING
The details of global routing differ slightly between cell-based ASICs, gate arrays,
and FPGAs, but the principles are the same in each case. A global router does not make any
connections; it just plans them. We typically global route the whole chip (or large pieces if it
is a large chip) before detail routing the whole chip (or the pieces).

There are two types of areas to global route:

1. inside the flexible blocks


2. between blocks

GOALS AND OBJECTIVES


The input to the global router is a floorplan that includes the locations of all the
fixed and flexible blocks; the placement information for flexible blocks; and the locations of
all the logic cells. The goal of global routing is to provide complete instructions to the
detailed router on where to route every net. The objectives of global routing are one or more
of the following:

● Minimize the total interconnect length.

● Maximize the probability that the detailed router can complete the routing.

● Minimize the critical path delay.


GLOBAL ROUTING METHODS

global routing
    sequential routing
        order-independent routing
        order-dependent routing
    hierarchical routing
        top-down approach (whole chip, or highest level, first)
        bottom-up approach

One approach to global routing takes each net in turn and calculates the shortest path
using tree on graph algorithms with the added restriction of using the available channels.
This process is known as sequential routing.

As a sequential routing algorithm proceeds, some channels will become more
congested since they hold more interconnects than others. In the case of FPGAs and
channeled gate arrays, the channels have a fixed channel capacity and can only hold a
certain number of interconnects.

There are two different ways that a global router normally handles this problem.
Using order-independent routing, a global router proceeds by routing each net, ignoring
how crowded the channels are. Whether a particular net is processed first or last does not
matter; the channel assignment will be the same. In order-independent routing, after all the
interconnects are assigned to channels, the global router returns to those channels that are
the most crowded and reassigns some interconnects to other, less crowded, channels.

Alternatively, a global router can consider the number of interconnects already
placed in various channels as it proceeds. In this case the global routing is order dependent:
the routing is still sequential, but now the order of processing the nets will affect the results.
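
Order-dependent sequential routing can be sketched as a shortest-path search that skips channels already filled by earlier nets. The channel graph, the use of a single capacity for every channel, and the function names are assumptions made for this illustration; Dijkstra's algorithm is used here as one possible shortest-path method.

```python
import heapq

def shortest_path(graph, usage, capacity, src, dst):
    """Dijkstra over channel edges that still have spare capacity."""
    queue, seen = [(0, src, [src])], set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, length in graph[node].items():
            edge = frozenset((node, nxt))
            if nxt not in seen and usage.get(edge, 0) < capacity:
                heapq.heappush(queue, (dist + length, nxt, path + [nxt]))
    return None   # no route within the channel capacities

def route_in_order(graph, nets, capacity):
    """Order-dependent sequential routing: earlier nets claim capacity first."""
    usage, routes = {}, {}
    for name, src, dst in nets:
        path = shortest_path(graph, usage, capacity, src, dst)
        routes[name] = path
        for a, b in zip(path or [], (path or [])[1:]):
            edge = frozenset((a, b))
            usage[edge] = usage.get(edge, 0) + 1
    return routes
```

If two nets both connect a to d and each channel can hold one interconnect, the net routed first takes the short path through b and the second is pushed onto the longer path through c. Reversing the processing order swaps the two results, which is exactly what order dependence means.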

In contrast to sequential global-routing methods, which handle nets one at a time,
hierarchical routing handles all nets at a particular level at once. Rather than handling all
of the nets on the chip at the same time, the global-routing problem is made more tractable
by dividing the chip area into levels of hierarchy. By considering only one level of hierarchy
at a time the size of the problem is reduced at each level. There are two ways to traverse the
levels of hierarchy.

Starting at the whole chip, or highest level, and proceeding down to the logic cells is
the top-down approach. The bottom-up approach starts at the lowest level of hierarchy and
globally routes the smallest areas first.


GLOBAL ROUTING BETWEEN BLOCKS

FIGURE 17.4 Global routing for a cell-based ASIC formulated as a graph problem. (a) A
cell-based ASIC with numbered channels. (b) The channels form the edges of a graph. (c)
The channel-intersection graph. Each channel corresponds to an edge on a graph whose
weight corresponds to the channel length.

Figure 17.5 shows an example of global routing for a net with five terminals, labeled A1
through F1, for the cell-based ASIC shown in Figure 17.4. If a designer wishes to use
minimum total interconnect path length as an objective, the global router finds the minimum-
length tree shown in Figure 17.5 (b). This tree determines the channels the interconnects will
use.

FIGURE 17.5 Finding paths in global routing. (a) A cell-based ASIC (from Figure 17.4)
showing a single net with a fanout of four (five terminals). We have to order the numbered
channels to complete the interconnect path for terminals A1 through F1. (b) The terminals
are projected to the center of the nearest channel, forming a graph. A minimum-length
tree for the net that uses the channels and takes into account the channel capacities. (c)
The minimum-length tree does not necessarily correspond to minimum delay. If we wish to
minimize the delay from terminal A1 to D1, a different tree might be better.


BACK-ANNOTATION
After global routing is complete it is possible to accurately predict what the length
of each interconnect in every net will be after detailed routing, probably to within 5 percent.
The global router can give us not just an estimate of the total net length (which was all we
knew at the placement stage), but the resistance and capacitance of each path in each net.
This RC information is used to calculate net delays. We can back-annotate this net delay
information to the synthesis tool for in-place optimization or to a timing verifier to make
sure there are no timing surprises. Differences in timing predictions at this point arise due
to the different ways in which the placement algorithms estimate the paths and the way the
global router actually builds the paths.
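
As an illustration of how RC information can be turned into a net delay, the sketch below uses the Elmore delay of an RC ladder. The Elmore model is an assumption made here as one common first-order estimate; these notes do not say which delay model a particular tool uses, and the segment values are hypothetical.

```python
def elmore_delay(segments):
    """Elmore delay of an RC ladder from the driver to the far end.

    segments: list of (resistance_ohm, capacitance_farad) in order from the
    driver; each capacitance sits at the downstream end of its resistance.
    Delay = sum over segments of R_i * (capacitance at or beyond segment i).
    """
    delay = 0.0
    downstream = sum(c for _, c in segments)
    for r, c in segments:
        delay += r * downstream
        downstream -= c
    return delay
```

For a path with a 100 Ω, 0.1 pF segment followed by a 200 Ω, 0.2 pF segment, the estimate is 100 × 0.3 pF + 200 × 0.2 pF = 70 ps.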

End of Module 2 Notes


Best of Luck
