INTRODUCTION
1.1 MOTIVATION
The major objective of floorplanning/placement is to locate the modules of a
circuit on a chip so as to optimize its area and timing. Floorplanning, being the
first stage of VLSI physical design, is the phase best suited for early
optimization of timing, congestion, and routability. Floorplanning thus has a
profound impact on the area, delay, power, and many other design parameters. To
ensure an effective and reliable design, careful and accurate floorplanning is
necessary. As design complexity increases, circuit sizes in modern VLSI design
keep growing. To handle this complexity, hierarchical design and reuse of IP
modules have become popular, which makes floorplanning/placement more
important than ever.
Further, the need to integrate heterogeneous systems or special modules
imposes placement constraints, e.g., the boundary-module constraint, which
requires some modules to be placed along the chip boundaries for shorter
connections to pads, and the pre-placed-module constraint, which pre-assigns
modules to specific positions. These trends make floorplanning/placement much
more important, and it is of particular significance to consider the
Chapter 2
LITERATURE SURVEY
2.1 Physical Design
Physical design of a circuit is the phase that precedes the
fabrication of a circuit. In most general terms, physical design refers
to all the synthesis steps succeeding logic design and preceding
fabrication. The performance of the circuit, its area, its yield, and its
reliability depend critically on the way the circuit is physically laid
out.
In an integrated circuit layout, metal and polysilicon are used
to connect two points that are electrically equivalent. Both metal
and poly lines introduce wiring impedances. Thus a wire can impede
a signal from traveling at a fast speed. The longer the wire, the
larger the wiring impedance, and the longer the delay it introduces.
When more than one metal layer is used for layout, there is another
source of impedance. If a connection is implemented partly in metal
layer 1 and partly in metal layer 2, a via is used at the point of
layer change. Similarly, if a connection is implemented partly in
polysilicon and partly in metal, a contact becomes necessary to
perform the layer change. Contacts and vias introduce a significant
amount of impedance, once again contributing to the slowing down of
signals.
Layout also critically affects the area of a circuit. There are two
components to the area of an integrated circuit: the functional area
and the wiring area. The area taken up by the active elements in the
circuit is known as the functional area. The wires used to
interconnect these functional modules contribute to the wiring area.
The wires also affect the performance of the circuit. A good layout
should place strongly connected modules close together, so that
longer wires are avoided as
2.2.1 Partitioning
A chip may contain several million transistors. The layout of the
entire circuit cannot be handled at once because of the limited
memory space and computation power available. Therefore, the circuit
is normally partitioned by grouping the components into blocks
(sub-circuits / modules). The actual partitioning process considers
many factors, such as the size of the blocks, the number of blocks,
and the number of interconnections between the blocks. The set of
interconnections
objective of the routing phase is to complete the
2.2.4 Compaction
Compaction is simply the task of compressing the layout in all
directions such that the total area is reduced. Making the chip
smaller reduces wire lengths, which in turn reduces the signal
delay between components of the circuit. At the same time, a smaller
area may imply that more chips can be produced on a wafer, which in
turn reduces the cost of manufacturing.
2.3 FLOORPLANNING
Floorplanning is a major step in the physical design cycle of VLSI
circuits. It is the step that plans the positions and shapes of the top-level
blocks of a hierarchical design. The floorplan is a physical description of an
ASIC. The traditional floorplanning problem takes as input a set of modules
(blocks), their widths and heights, and the interconnections between them. It
tries to find a floorplan such that the total area, delay, and power are
minimized. The input to a floorplanning step is a hierarchical netlist that
describes the interconnection of the blocks (RAM, ROM, ALU, cache controller,
and so on), the logic cells (NAND, NOR, D flip-flop, and so on) within the
blocks, and the logic cell connectors (the terms terminals, pins, and ports mean
the same thing as connectors). The netlist is a logical description of the ASIC;
we now have to set aside spaces (channels) for interconnect and arrange the
cells. Floorplanning is thus a mapping between the logical description (the
netlist) and the physical description (the floorplan).
Floorplanning follows the system partitioning step and is the first step in
arranging circuit blocks on an ASIC. There are many factors to be considered
during floorplanning: minimizing connection length and signal delay between
blocks; arranging fixed blocks and reshaping flexible blocks to occupy the
minimum die area; organizing the interconnect areas between blocks; and planning
the power, clock, and I/O distribution. The criterion for optimization may be
minimum interconnect area, minimum total interconnect length, or performance.
The goals of floorplanning are to
The objectives of floorplanning are to:
minimize delay
minimize the chip area
i) Shape constraint: since we do not want chips laid out as long strips, there
should be bounds on the aspect ratio of each block: r_i < h_i / w_i < s_i, where
h_i / w_i is the aspect ratio of block i. For hard blocks, only the orientation
can be changed.
ii) Capacity constraint: The chip is divided into several bins by superimposing
a grid. Each bin boundary has an associated capacity (the maximum number of nets
that can cross it). The objective is to minimize the capacity violation on these
bins.
iii) Timing constraints: Based on the given clock speed, the goal is to meet the
critical delay for the longest paths (between two flip-flop boundaries). This
delay may be modeled as the sum of the delays of the nets and gates on the
critical path. Given the input specification, the objective is to find a
floorplan which best meets the given constraints.
iv) Overlap constraints:
Prevent any two blocks from overlapping.
v) Routability constraints:
Estimate the routing area required between the blocks.
2.3.4 Wirelength estimation
The exact wire length of each net is not known until routing is done. In
floorplanning, even pin positions are not known. The process of identifying pin
locations is called pin assignment. Possible wire length estimation methods are
center-to-center estimation, the half-perimeter of the rectangle enclosing all
terminals in a net, and the minimum rectilinear spanning/Steiner tree.
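The half-perimeter estimate mentioned above can be sketched as follows. This is only an illustration: the function names `hpwl` and `total_hpwl` and the representation of a net as a list of (x, y) pin coordinates are assumptions, not part of the original implementation.

```python
def hpwl(pins):
    """Half-perimeter of the bounding rectangle of one net.

    pins: list of (x, y) terminal positions (assumed representation).
    """
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets):
    """Sum of the half-perimeter estimates over all nets."""
    return sum(hpwl(pins) for pins in nets)


# A three-pin net whose bounding rectangle is 3 wide and 4 high.
example_net = [(0, 0), (3, 4), (1, 1)]
print(hpwl(example_net))
```

Because pin positions are not yet known during floorplanning, block centres would typically stand in for the pin coordinates in such an estimate.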
a) Center-to-center estimation
Dead space is the space that is wasted; minimizing area is the same as
minimizing dead space. Dead space percentage is computed as
$$\frac{A - \sum_i A_i}{\sum_i A_i} \times 100\%$$
where A is the area of the floorplan and A_i is the area of module i.
(b) rules stating how to manipulate the data in the knowledge base
in order to progress toward a solution, and (c) an inference engine
controlling the application of the rules to the knowledge base.
a. Slicing floorplan
that edge in tree. The permutation p is the label sequence when we traverse the
tree in depth-first search order. The first element in permutation p is the
first child of the root.
The following example demonstrates the encoding of an 8-node rooted
ordered tree: Given the 8-node tree shown in Fig. 1, its root node has three
subtrees rooted at a, b, and c. We can represent it by (00110100011011, adbcegf).
Starting from the root, we visit node a first and record a bit 0 to T and a
label a to p. Then we visit node d and record a bit 0 to T and a label d to p.
On the way back to the root from nodes d and a, we record two bits 11 to T. Then
we visit subtrees b and c in sequence, and record the remainder of T and p
respectively. The length of the bit string T is 14, two bits for each of the
seven edges.
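The encoding walk described above can be sketched in a few lines. The tree shape below is inferred from the pair (00110100011011, adbcegf) given in the text; the dictionary representation and the name `encode_otree` are assumptions made for illustration.

```python
def encode_otree(children, root):
    """Encode a rooted ordered tree as (T, p).

    children: dict mapping a node to the ordered list of its children
              (assumed representation).
    A 0 is recorded when an edge is descended, a 1 when it is ascended,
    and the label of each node is recorded on its first visit.
    """
    bits, labels = [], []

    def visit(node):
        for child in children.get(node, []):
            bits.append("0")
            labels.append(child)
            visit(child)
            bits.append("1")

    visit(root)
    return "".join(bits), "".join(labels)


# Tree inferred from the example: the root has subtrees a, b, c.
tree = {"root": ["a", "b", "c"], "a": ["d"], "c": ["e", "f"], "e": ["g"]}
T, p = encode_otree(tree, "root")
print(T, p)
```

Each of the seven edges contributes one 0 and one 1, which is why the bit string has twice as many bits as there are labels.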
CHAPTER 3
TRANSITIVE CLOSURE GRAPH
3.1 Introduction
The transitive closure graph (TCG) is a graph-based representation for general
floorplans. TCG uses a horizontal and a vertical transitive closure graph to
describe the horizontal and vertical relations for each pair of modules.
It combines the advantages of sequence pair, BSG, and B*-tree. Like
sequence pair and BSG, but unlike O-tree, B*-tree, and CBL, TCG satisfies the
four properties of P-admissibility: (1) its solution space is finite, (2) it
guarantees a unique feasible packing for each representation, (3) packing and
cost evaluation can be performed in O(m^2) time, and (4) the best evaluated
packing in the solution space corresponds to an optimum placement. Like B*-tree,
but unlike sequence pair, BSG, O-tree, and CBL, TCG does not need to construct
additional constraint graphs for the cost evaluation during packing, implying
faster runtime. Further, TCG supports incremental update during operations, and
keeps the information of boundary modules as well as the shapes and the relative
positions of modules in the representation. More importantly, the geometric
relation among modules is transparent not only to the TCG representation but
also to its operations (i.e., the effect of an operation on the change of the
geometric relation is known before packing), facilitating faster convergence to
a desired solution. All these properties make TCG an effective and flexible
representation for handling general floorplan/placement design problems with
various constraints such as boundary constraints. Compared with O-tree, enhanced
O-tree, and B*-tree, TCG has much smaller runtime requirements.
modules bi and bj, bi is said to be diagonally related to bj if bi is on the left
side of bj and their projections on the x and the y axes do not overlap. In a
placement, every two modules must bear one of three relations: horizontal
relation, vertical relation, or diagonal relation. To simplify the operations on
geometric relations, we treat a diagonal relation for modules bi and bj as a
horizontal one, unless there exists a chain of vertical relations from bi (bj),
followed by the modules enclosed with the rectangle defined by the two closest
corners of bi and bj, and finally to bj (bi), for which we make bi ⊥ bj
(bj ⊥ bi), i.e., a vertical relation. Figure 3.1 shows a placement and its TCG
(Ch, Cv).
a) Floorplan
(a) Augmented Ch
(b) Augmented Cv
Pseudocode
For (i ← 1; i ≤ n; i ← i + 1)
    Xi ← −∞;
Count ← 0;
S1 ← {V0};
S2 ← ∅;
While (Count ≤ n && S1 ≠ ∅)
{
    For each Vi ∈ S1
        For each Vj such that (Vi, Vj) ∈ E
            If (Xj < Xi + dij)
            {
                Xj ← Xi + dij;
                S2 ← S2 ∪ {Vj}
            }
    S1 ← S2;
    S2 ← ∅;
    Count ← Count + 1;
}
If (Count > n)
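A runnable sketch of this longest-path computation is given below. Several details are assumptions rather than part of the pseudocode: the source vertex is given the value 0 explicitly, the graph is a dictionary of weighted adjacency lists, and a positive cycle is reported when the frontier is still non-empty after the loop (rather than by testing the counter), so that a legitimate chain through all n + 1 vertices is not flagged.

```python
import math


def longest_paths(n, edges, source=0):
    """Longest path values from `source` in a graph with vertices 0..n.

    edges: dict mapping u to a list of (v, d_uv) pairs (assumed layout).
    Raises ValueError if a positive cycle keeps the values growing.
    """
    x = [-math.inf] * (n + 1)
    x[source] = 0  # assumption: the source is the reference point
    count = 0
    s1 = {source}
    while count <= n and s1:
        s2 = set()
        for vi in s1:
            for vj, dij in edges.get(vi, []):
                if x[vj] < x[vi] + dij:
                    x[vj] = x[vi] + dij
                    s2.add(vj)
        s1 = s2
        count += 1
    if s1:  # values were still changing after n + 1 rounds
        raise ValueError("positive cycle")
    return x


# Vertex 0 is a zero-weight source; block 1 is 4 wide, block 2 is 2 wide.
graph = {0: [(1, 0), (2, 0)], 1: [(3, 4)], 2: [(3, 2)]}
print(longest_paths(3, graph))
```

In a constraint graph the edge weight d_ij would be the width (or height) of the module at the tail of the edge, so the returned values are the module coordinates.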
nets gives an exact value. The sum of the HPWL over all nets gives an
approximate interconnect wire length for a placement solution. Another cost
function parameter widely used is component overlap; as design rules do not
allow components to overlap each other, any such instance should be treated as a
penalty. A penalty factor is added directly to the wire length approximation. A
commonly used objective function is a weighted sum of area and wire length:
cost = αA + βL, where A is the total area of the packing, L is
the total wire length, and α and β are constants.
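As a small illustration of the weighted objective, the sketch below computes the packing area as the bounding rectangle of the placed blocks and combines it with a wire length estimate. The tuple layout (x, y, w, h), the assumption that the packing starts at the origin, and the default weights are all hypothetical choices made for the example.

```python
def packing_area(blocks):
    """Area of the bounding rectangle of placed blocks.

    blocks: list of (x, y, w, h) tuples with the packing anchored at
            the origin (assumed representation).
    """
    width = max(x + w for x, _, w, _ in blocks)
    height = max(y + h for _, y, _, h in blocks)
    return width * height


def floorplan_cost(area, wirelength, alpha=1.0, beta=1.0):
    """Weighted sum alpha * A + beta * L."""
    return alpha * area + beta * wirelength


placed = [(0, 0, 2, 2), (2, 0, 1, 3)]
area = packing_area(placed)
print(floorplan_cost(area, wirelength=7, alpha=1.0, beta=0.5))
```

The weights trade area against wire length; setting beta to zero gives the area-only objective.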
The simplest way to generate a new placement is to move one random
component from one position to another random position, while another fairly
simple change is to swap the positions of two random components. Changing a
component's orientation will move its ports, resulting in small changes to the
interconnect lengths of the nets of which the component is a member. This type
of move is usually performed only when no other type of perturbation yields a
new solution. The key to applying simulated annealing to placement is the
cooling schedule which the algorithm follows. As the algorithm's greediness is
inversely proportional to the system's temperature, moves may be accepted that
actually increase a placement's cost. Enough time must be spent in the upper and
lower temperatures to allow first a quick arrangement of the system and then a
final localized arrangement, respectively. If too much time is spent in the
upper temperatures, processing time will be wasted because many inefficient
intermediate solutions will be accepted; if too much time is spent in the lower
temperatures, processing time will again be wasted because of the tight
restriction on accepted moves. The algorithm is generalized by Figure 3.9.
3.5.5 Simulated annealing algorithm
Algorithm Simulated Annealing
Begin
Temp = Initial_Temp
Current placement = Initial Placement
While (Temp != Final_Temp) do
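A generic annealing loop in the spirit of this listing might look like the sketch below. It is not the thesis implementation: the geometric cooling factor, the number of moves per temperature, the Metropolis acceptance rule, and the loop test `temp > t_final` (instead of an exact inequality test) are all assumptions, and `cost` and `perturb` are caller-supplied placeholders.

```python
import math
import random


def simulated_annealing(initial, cost, perturb, t_init=100.0, t_final=0.1,
                        cooling=0.9, moves_per_temp=50, seed=0):
    """Minimise `cost` starting from `initial`.

    perturb(solution, rng) must return a new candidate solution.
    Worse candidates are accepted with probability exp(-delta / temp).
    """
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temp = t_init
    while temp > t_final:
        for _ in range(moves_per_temp):
            candidate = perturb(current, rng)
            delta = cost(candidate) - current_cost
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temp *= cooling  # geometric cooling schedule (assumed)
    return best, best_cost


# Toy problem: order a tuple so that every value sits at its own index.
def displacement(p):
    return sum(abs(i - v) for i, v in enumerate(p))


def swap_two(p, rng):
    i, j = rng.sample(range(len(p)), 2)
    q = list(p)
    q[i], q[j] = q[j], q[i]
    return tuple(q)


start = (4, 3, 2, 1, 0)
best, best_cost = simulated_annealing(start, displacement, swap_two)
print(best, best_cost)
```

For floorplanning, `perturb` would apply one of the representation's perturbation operations and `cost` would evaluate the packed placement.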
simulated annealing is computationally
Chapter 4
TCG-S: Combination of TCG and SP
4.1 Introduction
This chapter considers the equivalence of the two most promising P*-admissible
representations, TCG and SP, and integrates TCG with a packing sequence (part of
SP) into a new representation, called TCG-S. TCG-S combines the advantages of SP
and TCG and at the same time eliminates their disadvantages. With the property
of SP, faster packing and perturbation schemes are possible. Inheriting the
desirable properties of TCG, the geometric relations among modules are
transparent to TCG-S (implying faster convergence to a desired solution),
placement with position constraints becomes much easier, and incremental update
for cost evaluation can be realized. These properties make TCG-S a superior
representation which exhibits an elegant solution structure to facilitate the
search for a desired floorplan/placement. TCG-S results in the best area
utilization, wirelength optimization, convergence speed, and stability among
existing works and is very flexible in handling placement with special
constraints.
Among the existing popular representations, SP, BSG, and TCG are
P*-admissible while slicing tree, NPE, O-tree, B*-tree, CBL, and Q-sequence are
not. The slicing tree and NPE are intended for slicing floorplans only. Since an
optimal placement could be a non-slicing structure, the two representations are
not P-admissible and thus not P*-admissible (i.e., violation of P*-admissibility
Condition (4)). An O-tree defines only a one-dimensional geometric relation
between compacted modules and thus can obtain the relation in the other
dimension only after packing (i.e., violation of Condition (5)). A B*-tree
requires a placement to be left and/or bottom compacted. However, the space
intended for placing a module may be occupied by previously placed modules
during packing, resulting in a mismatch between the original representation and
its compacted placement. Therefore, it may not be feasible to find a compacted
placement corresponding to the original B*-tree, and thus it is not P-admissible
(i.e., violation of Condition (2)). CBL and Q-sequence can represent only mosaic
floorplans, in which each region in the floorplan contains exactly one module.
CBL and Q-sequence are not P-admissible because they cannot guarantee a feasible
solution after a perturbation (i.e., violation of Conditions (2) and (4)).
zero weight and connect it to those nodes with zero in-degree, the x coordinate
of each module can be obtained by applying the longest path algorithm on the
resulting directed acyclic graph. Therefore, we have
xg = max(x'a, x'b, x'c, x'd, x'e), where (xi, yi) and (x'i, y'i) denote the
bottom-left and top-right corners of module bi. However, if we place modules
based on the sequence Γ- and maintain a horizontal and a vertical contour,
denoted by RH and RV respectively, for the placed modules, the number of nodes
that need to be considered can be reduced. Let RH (RV) be a list of modules bi
for which there exists no module bj with yj ≥ y'i (xj ≥ x'i) and x'j ≥ x'i
(y'j ≥ y'i). Suppose we have packed the modules a, b, c, d, e based on the
sequence Γ-; then the resulting horizontal contour will be RH = <c, e, d>.
Keeping RH, we only need to traverse the contour from e, the successor of c, to
the last module d, which have horizontal relations with g. Thus, we have
xg = x'd. Packing modules in this way, we only need to consider x'e and x'd and
can get rid of the computation for a maximum value, leading to a faster packing
scheme.
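The basic longest-path evaluation described at the start of this passage can be sketched as a single pass over the packing sequence. This sketch does not implement the contour speed-up; it assumes that the sequence is a topological order of both graphs, that the graphs are given as sets of (from, to) edges, and that module sizes come from dictionaries. All names are hypothetical.

```python
def pack(widths, heights, ch, cv, sequence):
    """Compute module coordinates from a TCG and a packing sequence.

    ch, cv: sets of (u, v) edges; (u, v) in ch means u is left of v,
            (u, v) in cv means u is below v (assumed representation).
    sequence: modules in a topological order of both ch and cv.
    Returns (x, y, W, H).
    """
    x, y = {}, {}
    for b in sequence:
        # A module starts where its farthest-reaching predecessor ends.
        x[b] = max((x[a] + widths[a] for a, c in ch if c == b), default=0)
        y[b] = max((y[a] + heights[a] for a, c in cv if c == b), default=0)
    W = max(x[b] + widths[b] for b in sequence)
    H = max(y[b] + heights[b] for b in sequence)
    return x, y, W, H


# Two modules with a below b; the sizes are made up for the example.
widths = {"a": 2.0, "b": 1.0}
heights = {"a": 1.5, "b": 1.0}
x, y, W, H = pack(widths, heights, ch=set(), cv={("a", "b")}, sequence=["a", "b"])
print(x, y, W, H)
```

Scanning all edges for every module gives quadratic behaviour at best, which is the cost the contour-based scheme is meant to reduce.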
horizontal one, unless there exists a chain of vertical relations from bi (bj),
followed by the modules enclosed with the rectangle defined by the two closest
corners of bi and bj, and finally to bj (bi), for which we make bi ⊥ bj
(bj ⊥ bi), i.e., a vertical relation.
(a) A placement
longer in the contour. The ordering of nodes in TH (TV) can be obtained by
depth-first search. This process repeats for all modules in Γ-. We have W = x'v
(H = y'v) if bv is the module in the resulting TH (TV) with the largest value,
where W (H) denotes the width (height) of the placement.
To pack the first module ba in Γ-, we traverse TH (TV) from the root bs (bt)
and insert it as the right child of bs (bt), since bs ⊢ ba (bt ⊥ ba). Therefore,
the first module ba in Γ- is placed at the bottom-left corner, i.e.,
(xa, ya) = (0, 0), since bs (bt) is the last module that is horizontally
(vertically) related to ba and x's = 0, y't = 0 (Fig. 4.2(a) shows the balanced
binary search tree after ba is inserted into TH (TV)).
Similarly, to pack the second module bb in Γ-, we traverse TH from the root bs
and then its right child ba, since bs ⊢ ba. Then bb is inserted as the left
child of ba, since ba and bb are not horizontally related. Because bs is the
last module with bs ⊢ bb in the path, xb = x's = 0. Similarly, we traverse TV
from the root bt and then its right child ba, since bt ⊥ ba. Then bb is inserted
as the right child of ba in TV since ba ⊥ bb. Therefore, yb = y'a = 1.5 because
ba is the last module with ba ⊥ bb in the path. The resulting balanced binary
search trees after performing tree rotations, TH' and TV', are shown in
(ii) Swap
Swapping ni and nj does not change the topologies of Ch and Cv, except that
nodes ni and nj in both Ch and Cv are exchanged. Therefore we only need to
exchange bi and bj in Γ-. Figure 4.4 shows the resulting TCG-S after swapping
the nodes nc and ng shown in Figure 4.3. The modules bc and bg in Γ- are
exchanged.
Fig 4.4: The resulting TCG-S after swapping the nodes nc and ng
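A minimal sketch of the Swap operation, assuming the two graphs are stored as sets of (from, to) edges and the packing sequence as a list; the function name and data layout are hypothetical:

```python
def swap(ch, cv, sequence, ni, nj):
    """Exchange nodes ni and nj in Ch, Cv and the packing sequence.

    The graph topologies are untouched; only the two labels trade places.
    """
    def relabel(n):
        if n == ni:
            return nj
        if n == nj:
            return ni
        return n

    new_ch = {(relabel(a), relabel(b)) for a, b in ch}
    new_cv = {(relabel(a), relabel(b)) for a, b in cv}
    new_sequence = [relabel(b) for b in sequence]
    return new_ch, new_cv, new_sequence


# a is left of b, and both a and b are below c.
ch = {("a", "b")}
cv = {("a", "c"), ("b", "c")}
new_ch, new_cv, new_seq = swap(ch, cv, ["a", "b", "c"], "a", "c")
print(new_ch, new_cv, new_seq)
```

Because only labels change, the new sequence remains a topological order of both graphs without any further repair.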
(iii) Reverse
Reverse changes the geometric relation between bi and bj from bi ⊢ bj (bi ⊥ bj)
to bj ⊢ bi (bj ⊥ bi). To reverse a reduction edge (ni, nj) in one transitive
closure graph, we first delete the edge (ni, nj) from the graph, and then add
the edge (nj, ni) to the graph. To keep Ch and Cv feasible, for each node
nk ∈ Fin(nj) ∪ {nj} and nl ∈ Fout(ni) ∪ {ni} in the new graph, we have to keep
the edge (nk, nl) in the graph. If the edge does not exist in the graph, we add
the edge to the graph and delete the corresponding edge (nk, nl) (or (nl, nk))
in the other graph. To make Γ- conform to the topological ordering of the new Ch
and Cv, we delete bi from Γ- and insert bi after bj. For each module bk between
bi and bj in Γ-, we check whether the edge (ni, nk) exists in the same graph. If
it does, we delete bk from Γ- and insert it after the most recently inserted
module.
Figure 4.5 shows the resulting TCG-S after reversing the reduction edge
(nd, ne) of the Cv in Figure 4.4. Since there exists no module between bd and be
in Γ-, we only need to delete bd from Γ- and insert it after be, and the
resulting Γ- is shown in Figure 4.5.
Figure 4.5: The resulting TCG-S after reversing the reduction edge (nd, ne)
(iv) Move
Move changes the geometric relation between bi and bj from bi ⊢ bj (bi ⊥ bj) to
bi ⊥ bj (bi ⊢ bj). To move a reduction edge (ni, nj) from a transitive closure
graph G to the other graph G', we delete the edge from G and then add it to G'.
Similar to Reverse, for each node nk ∈ Fin(ni) ∪ {ni} and nl ∈ Fout(nj) ∪ {nj}
in G', we must move the edge (nk, nl) to G' if the corresponding edge (nk, nl)
(or (nl, nk)) is in G. Since the operation changes only the edges in Ch or Cv
but not the topological ordering among nodes, Γ- remains unchanged. Figure 4.6
shows the resulting TCG-S after moving the reduction edge (na, ne) from Ch to Cv
in Figure 4.5. Notice that the resulting Γ- is the same as that in Figure 4.5.
Figure 4.6: The resulting TCG-S after moving the reduction edge (na, ne) from
Ch to Cv
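The Move operation can be sketched on edge sets as follows. The sketch assumes that the caller passes a reduction edge, that each graph is a set of (from, to) edges, and that Fin and Fout are the direct fan-in and fan-out of a node in the receiving graph; these choices and the function name are assumptions made for illustration.

```python
def move(g, g_other, ni, nj):
    """Move edge (ni, nj) from graph g to graph g_other.

    Returns the updated (g, g_other). For every nk in Fin(ni) + {ni} and
    nl in Fout(nj) + {nj} of the receiving graph, the pair (nk, nl) is
    pulled over as well so that the receiving graph stays transitively
    closed.
    """
    g = set(g) - {(ni, nj)}
    g_other = set(g_other) | {(ni, nj)}
    fan_in = {a for a, b in g_other if b == ni} | {ni}
    fan_out = {b for a, b in g_other if a == nj} | {nj}
    for nk in fan_in:
        for nl in fan_out:
            if (nk, nl) not in g_other:
                # The pair is currently related in g, in either direction.
                g.discard((nk, nl))
                g.discard((nl, nk))
                g_other.add((nk, nl))
    return g, g_other


# a is left of b; a and b are both below c. Move (a, b) from Ch to Cv.
ch, cv = move({("a", "b")}, {("a", "c"), ("b", "c")}, "a", "b")
print(ch, cv)
```

After the move the three modules form a vertical stack, and the packing sequence a, b, c is still a valid topological order, in line with the observation that Γ- is unchanged.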
Boundary Constraint: Given a boundary module bi, it must be placed on one of the
four sides of the chip in the final packing: on the left, on the right, at the
bottom, or at the top. If a module bi is placed along the left (right) boundary,
the in-degree (out-degree) of the node ni in Ch equals zero. If a module is
placed along the bottom (top) boundary, the in-degree (out-degree) of ni in Cv
equals zero. For each perturbation, we can guarantee a feasible placement by
checking whether the conditions of boundary modules are satisfied.
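The degree conditions above translate directly into a feasibility check. The sketch below assumes the graphs are sets of (from, to) edges; the function names and the side labels are hypothetical.

```python
def boundary_sides(node, ch, cv):
    """Return the chip sides that `node` may touch according to Ch and Cv."""
    sides = set()
    if not any(b == node for _, b in ch):
        sides.add("left")    # in-degree in Ch is zero
    if not any(a == node for a, _ in ch):
        sides.add("right")   # out-degree in Ch is zero
    if not any(b == node for _, b in cv):
        sides.add("bottom")  # in-degree in Cv is zero
    if not any(a == node for a, _ in cv):
        sides.add("top")     # out-degree in Cv is zero
    return sides


def satisfies_boundary(node, required_side, ch, cv):
    """Check one boundary constraint after a perturbation."""
    return required_side in boundary_sides(node, ch, cv)


ch = {("a", "b")}
cv = {("a", "c"), ("b", "c")}
print(boundary_sides("a", ch, cv), boundary_sides("c", ch, cv))
```

A perturbation that leaves some boundary module failing this check could simply be rejected before packing.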
Chapter 5
Software Implementation
5.1 Software Implementation Phases
In order to study and characterize the floorplan representation TCG-S when
applied to floorplan optimization, a C++ program was implemented on Linux. The
program takes floorplan benchmark circuit files as its input. The software
implementation of the SA placement algorithm went through several phases, as
given below.
PHASE 1: Placement to TCG-S
i) TCG-S: TCG + SP
ii) TCG-S: HCG & VCG from initial floorplan
PHASE 2: Solution perturbation using the simulated annealing algorithm
i) Rotation
ii) Swap
iii) Reverse
iv) Move
PHASE 3: TCG-S to Placement
i) HCG & VCG to floorplan
circuit.
Components of the structure are block name, type (hard or soft), area, xcord,
ycord, leng, width, and minimum and maximum aspect ratios.
ii) Structure which defines nodes in a graph (Gerez's structure for directed
acyclic graphs). Components of the structure are node name, ch weight, cv
weight, longest path, xcord, ycord, struct edge *outgoing, indegree, and
outdegree.
iii) Structure which defines edges in a graph. Components of the structure are
edge index, from, to, and next.
An array of these data structures is maintained to hold the blocks of a
floorplan benchmark circuit. The first part of the program reads the benchmark
circuit file and stores the information in the data structures defined.
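The three structures listed above could be mirrored as follows. The original program is written in C++; this is only a Python sketch of the same fields, with types, default values, the renaming of `from`/`to` to `src`/`dst` (since `from` is a Python keyword), and the helper `add_edge` all being assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    name: str
    kind: str                 # "hard" or "soft"
    area: float
    xcord: float = 0.0
    ycord: float = 0.0
    leng: float = 0.0
    width: float = 0.0
    min_aspect: float = 1.0
    max_aspect: float = 1.0


@dataclass
class Edge:
    index: int
    src: int                  # "from" node index
    dst: int                  # "to" node index
    next: Optional["Edge"] = None   # next edge leaving the same node


@dataclass
class Node:
    name: str
    ch_weight: float = 0.0
    cv_weight: float = 0.0
    longest_path: float = 0.0
    xcord: float = 0.0
    ycord: float = 0.0
    outgoing: Optional[Edge] = None  # head of the outgoing edge list
    indegree: int = 0
    outdegree: int = 0


def add_edge(nodes: List[Node], index: int, src: int, dst: int) -> Edge:
    """Prepend an edge to the source node's list and update the degrees."""
    edge = Edge(index, src, dst, next=nodes[src].outgoing)
    nodes[src].outgoing = edge
    nodes[src].outdegree += 1
    nodes[dst].indegree += 1
    return edge


nodes = [Node("a"), Node("b"), Node("c")]
add_edge(nodes, 0, 0, 1)
add_edge(nodes, 1, 0, 2)
print(nodes[0].outdegree, nodes[1].indegree, nodes[0].outgoing.dst)
```

Keeping the in-degree and out-degree on each node makes the boundary-module conditions a constant-time lookup.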
5.3 Benchmarks
To provide a test set for characterizing the algorithm, the GSRC/MCNC
benchmarks are employed. The benchmark sizes range from 33 to 200k
components across 18 circuits. In the simplest formulation of the block packing
problem, all the blocks are rectangular hard blocks with fixed heights and
widths, and their locations are free to assign within the whole layout region.
However, in real designs, the block packing algorithm may have to deal with
blocks with more complex features:
Soft blocks
In the early stage of physical design many of the circuit blocks are not yet
designed and are thus flexible (soft) in shape. For these soft blocks, the
block packing algorithm not only needs to determine their locations, but
also needs to assign specific shapes to them.
Rectilinear blocks
As some of the circuit blocks come from design reuse, their shapes are not
necessarily rectangular. Therefore the block packing algorithm should be able
to handle arbitrarily shaped rectilinear blocks.
Pre-placed blocks
In some cases, the locations of some blocks may be fixed, or a region may
cases the list of nets to be updated is deleted as this point represents the end of the
move attempt.
CHAPTER 6
RESULTS
6.1 Organization
This chapter presents the experimental results of the software
implementation of the floorplan representation TCG-S and analyzes its
performance against other representations. Based on a simulated annealing
method, the TCG-S representation has been implemented in the C++ programming
language on the Linux platform. The experiments are based on the five commonly
used MCNC and GSRC benchmark circuits. The results summarize the area
utilization of the two floorplan representations TCG and TCG-S.
6.2 TCG
Circuit Name    No. of Blocks    Area Utilization (%)
Apte            9                79
Hp              11               80
Xerox           10               77
Ami33           33               85
Ami49           49               86
N10             10               82
N30             30               83
N50             50               86
N200            200              87
N300            300              90
6.4 TCG-S
Circuit Name    No. of Blocks    Area Utilization (%)
Apte            9                79
Hp              11               80
Xerox           10               78
Ami33           33               85
Ami49           49               87
N10             10               84
N30             30               83
N50             50               88
N200            200              88
N300            300              91
Table 6.2 Summary of area utilization for various benchmark circuits using TCG-S
Number of Modules: 9
CHAPTER 7
CONCLUSIONS
This work began by implementing the floorplan representations TCG (transitive
closure graph) and SP (sequence pair); by combining these two representations, a
new representation, TCG-S, was developed. A simulated annealing algorithm was
designed for solution perturbation. An optimized solution with respect to
component placement and minimization of floorplan area is obtained. This step
gave insight into the intricacies of the simulated annealing algorithm, leading
to a novel approach to modeling a floorplan. With area optimization as the
primary objective, experiments based on a set of commonly used MCNC benchmarks
show that TCG-S results in the best area utilization and convergence speed.
CHAPTER 8
FUTURE DEVELOPMENTS
The presented annealing-based optimization considers only area; however, this
work can be extended to other objectives such as wirelength minimization and
placement with pre-placed modules.
8.1 Wirelength Minimization
Global interconnect is commonly recognized as a key factor in designing
high-performance integrated circuits, as VLSI process technology migrates into
deep submicron (DSM) dimensions and operates at gigahertz clock frequencies. By
using a wide range of interconnect synthesis and optimization techniques, such
as topology optimization, buffer insertion, layer assignment, wire sizing, and
wire spacing, the performance of a global interconnect could be improved by a
factor of 5 or more. As the global interconnects are largely determined by
floorplanning, it becomes critical for floorplanning engines to handle
interconnect planning and optimization efficiently, so that overall timing and
design convergence can be achieved.
8.2 TCG-S with Pre-placed Modules
Placement with pre-placed modules places a set of prespecified modules
at designated locations on a chip.
Pre-placed Constraint: Given a module bi with a fixed coordinate (xi, yi)
and an orientation, bi must be placed at the designated location with the same
orientation in the final packing.