9 Report 1

Chapter 1
INTRODUCTION
1.1 MOTIVATION
The major objective of floorplanning/placement is to locate the modules of a
circuit on a chip so as to optimize its area and timing. Floorplanning, being the
first stage of VLSI physical design, is the phase best suited to early optimization
of timing, congestion, and routability. Floorplanning thus has a profound impact on
the area, delay, power, and many other design parameters. To ensure an effective
and reliable design, careful and accurate floorplanning is necessary. As design
complexity increases, circuit sizes are growing in modern VLSI
design. To handle this complexity, hierarchical design and reuse of IP
modules have become popular, which makes floorplanning/placement much more
important than ever.
Further, the need to integrate heterogeneous systems or special modules
imposes placement constraints, e.g., the boundary-module constraint, which
requires some modules to be placed along the chip boundaries for shorter
connections to pads, and the pre-placed-module constraint, which pre-assigns
modules to specific positions. These trends make floorplanning/placement much more
important, and it is of particular significance to consider
floorplanning/placement with various constraints. To cope with these challenges, it
is desirable to develop an efficient and effective floorplan representation that can
model the geometric relations among regular as well as constrained modules.
The realization of Floorplanning relies on a representation which describes
geometric relations among modules. The representation has a great impact on the
feasibility and complexity of floorplan designs. Thus it is of particular
significance to develop an efficient, effective, and flexible representation for
floorplan designs.
1.2 ORGANIZATION OF THE REPORT

This report is organized as follows. Chapter 1 introduces the
thesis work. Chapter 2 formulates the floorplan/placement design problem.
Chapter 3 presents the procedures to derive a TCG from a placement and to construct
a placement from a TCG. Chapter 4 introduces the operations to perturb a TCG.
Experimental results are reported in Section 5. Finally, Section 6 concludes the
work and discusses future research directions.
Chapter 2
LITERATURE SURVEY
2.1 Physical Design
Physical design is the phase that precedes the
fabrication of a circuit. In the most general terms, physical design refers
to all the synthesis steps succeeding logic design and preceding
fabrication. The performance of the circuit, its area, its yield, and its
reliability depend critically on the way the circuit is physically laid
out.
In an integrated circuit layout, metal and polysilicon are used
to connect two points that are electrically equivalent. Both metal
and poly lines introduce wiring impedances. Thus a wire can impede
a signal from traveling at a fast speed. The longer the wire, the
larger the wiring impedance, and the longer the delay it introduces.
When more than one metal layer is used for layout, there is another
source of impedance. If a connection is implemented partly in metal
layer 1 and partly in metal layer 2, a via is used at the point of layer
change. Similarly, if a connection is implemented partly in polysilicon
and partly in metal, a contact becomes necessary to perform the layer
change. Contacts and vias introduce a significant amount of impedance,
once again contributing to the slowing down of signals.
Layout also critically affects the area of a circuit. There are two
components to the area of an integrated circuit: the functional area
and the wiring area. The area taken up by the active elements in the
circuit is known as the functional area.
The wires used to interconnect these functional modules
contribute to the wiring area. Wires thus affect the area of the
circuit just as they affect its performance. A good layout should
have strongly connected modules placed close together, so that long
wires are avoided as much as possible. Similarly, a good layout will
have as few vias as possible. The area of the circuit in turn has a
direct influence on the yield of the manufacturing process. We define
yield to be the number of chips that are defect-free in a batch of
manufactured chips. The larger the chip area, the lower the yield. A
low yield means a high production cost, which in turn increases the
selling price of the chip.
The reliability of the chip is also influenced by the layout. For
instance, vias are sources of unreliability, and a layout which
has a large number of vias is more likely to have defects. Further,
the width of a metal wire must be chosen appropriately by the
layout program to avoid metal migration. If a thin metal wire carries
a large current, the excessive current density may wear the metal
away, slowly tapering the wire until it results in an open circuit.
Hence floorplanning, placement, and routing are key stages in
the layout of any circuit, as they directly affect its performance,
yield, and reliability. In this thesis, a study of various approaches
to floorplanning is performed.
2.2 Physical Design Cycle

Physical design is the phase that precedes fabrication. The different
stages of the physical design cycle are shown in Figure 2.1.
2.2.1 Partitioning
A chip may contain several million transistors. The layout of the entire
circuit cannot be handled at once due to limitations of memory space
and available computation power. Therefore, the circuit is normally
partitioned by grouping the components into blocks (sub-circuits /
modules). The actual partitioning process considers many factors,
such as the size of the blocks, the number of blocks, and the number of
interconnections between the blocks. The set of interconnections
required is referred to as a netlist. The output of partitioning is a set
of blocks and the interconnections required between the blocks.
2.2.2 Floorplanning and Placement

This step is concerned with selecting good layout alternatives for
each block, as well as the entire chip. The area of each block can be
estimated after partitioning and is based approximately on the
number and the type of components in that block. Floorplanning is a
critical step, as it sets up the ground work for a good layout.
However, it is computationally quite hard. During placement, the
blocks are exactly positioned on the chip. The goal of placement is
to find a minimum area arrangement for the blocks that allows
completion of interconnections between the blocks, while meeting
the performance constraints.
Fig.2.1 VLSI Physical design cycle

2.2.3 Routing
The objective of the routing phase is to complete the
interconnections between blocks according to the specified netlist.

First, the space not occupied by the blocks (called the routing
space) is partitioned into rectangular regions called channels and
switchboxes. The goal of the router is to complete all circuit
connections using the shortest possible wire length and using only the
channels and switchboxes. This is usually done in two phases,
referred to as the global routing and detailed routing phases. For
each wire, the global router finds a list of channels and switchboxes
which are to be used as a passageway for that wire. Global routing
is followed by detailed routing, which completes point-to-point
connections between pins on the blocks. Detailed routing includes
channel routing and switchbox routing, and is done for each channel
and switchbox.
2.2.4 Compaction
Compaction is simply the task of compressing the layout in all
directions such that the total area is reduced. By making the chip
smaller, wire lengths are reduced, which in turn reduces the signal
delay between components of the circuit. At the same time, a smaller
area may imply that more chips can be produced on a wafer, which in
turn reduces the cost of manufacturing.
2.2.5 Extraction and Verification:

Design Rule Checking is the process which verifies that the layout obeys
the design rules imposed by the fabrication process. After the layout has
been checked and the design rule violations removed, the
functionality of the layout is verified by Circuit Extraction. This is a
reverse engineering process that generates the circuit representation
from the layout. The extracted description is compared with the circuit
description to verify its correctness. This process is called Layout
Versus Schematic (LVS) verification. Geometric information is
extracted to compute resistance and capacitance. This allows the
timing of each component, including interconnect, to be calculated
accurately. This process is called Performance Verification.
Physical design is iterative in nature, and many steps such as
global routing and channel routing are repeated several times to
obtain a better layout. In addition, the results obtained in a step depend
on the quality of the solution obtained in earlier steps. Finally, after
meeting all the specifications of the physical design cycle, the design
is sent for fabrication.
2.3 FLOORPLANNING
Floorplanning is a major step in the physical design cycle of VLSI
circuits. It is the step that plans the positions and the shapes of the top-level
blocks of a hierarchical design. The floorplan is a physical description of an ASIC.
The traditional floorplanning problem takes as input a set of modules (blocks), their
widths and heights, and the interconnections between them. It tries to find a
floorplan such that the total area, delay, and power are minimized. The input to a
floorplanning step is a hierarchical netlist that describes the interconnection of the
blocks (RAM, ROM, ALU, cache controller, and so on), the logic cells (NAND,
NOR, D flip-flop, and so on) within the blocks, and the logic cell connectors (the
terms terminals, pins, or ports mean the same thing as connectors). The netlist is a
logical description of the ASIC; we now have to set aside spaces (channels) for
interconnect and arrange the cells. Floorplanning is thus a mapping between the
logical description (the netlist) and the physical description (the floorplan).
Fig.2.2 Typical Floor Plan of a chip

2.3.1 Floorplanning Goals and Objectives
Floorplanning being the first stage of VLSI Physical Design is the most suited
for early optimization of timing, congestion and routability. Floorplanning
follows the system partitioning step and is the first step in arranging circuit
blocks on an ASIC. There are many factors to be considered during
floorplanning: minimizing connection length and signal delay between blocks;
arranging fixed blocks and reshaping flexible blocks to occupy the minimum die
area; organizing the interconnect areas between blocks; planning the power,
clock, and I/O distribution. The criterion for optimization may be minimum
interconnect area, minimum total interconnect length, or performance.
The goals of floorplanning are to:
arrange the blocks on a chip,
decide the location of the I/O pads,
decide the location and number of the power pads,
decide the type of power distribution, and
decide the location and type of clock distribution.
The objectives of floorplanning are to:
minimize delay, and
minimize the chip area.
2.3.2 Floorplan Problem Definition

The major objective of floorplanning/placement is to locate the modules of a
circuit on a chip so as to optimize its area and timing. The input to the
floorplanning algorithm is a circuit C(M, N), where M is the set of modules in the
floorplanning system and N is the set of nets defining the connectivity among these
modules. Modules can be soft or hard. A soft module is a module whose
width and height can be changed as long as the aspect ratio is within a given range
and the area is as given. A hard module is a module whose width and height are
fixed.
2.3.3 User Defined Constraints
i) Shape constraint: since we do not want chips laid out as long strips, there
should be bounds on the aspect ratio of each block: ri < hi/wi < si, where hi/wi is
the aspect ratio of block i. For hard blocks, only the orientation can be changed.
ii) Capacity constraint: the chip is divided into several bins by superimposing a
grid. Each bin boundary has a capacity (the maximum number of nets that can cross
it) associated with it. The objective is to minimize the capacity violation on these
bins.
iii) Timing constraints: based on the given clock speed, the goal is to meet the
critical delay for the longest paths (between two flip-flop boundaries). This delay
may be modeled as the sum of the delays of the nets and gates on the critical
path.
iv) Overlap constraints: prevent any two blocks from overlapping.
v) Routability constraints: estimate the routing area required between the blocks.
Given the input specification, the objective is to find a floorplan which best
meets the given constraints.
2.3.4 Wirelength estimation
Exact wire length of each net is not known until routing is done. In floorplanning,
even pin positions are not known. The process of identifying pin location is called
pin assignment. Possible wire length estimation methods are center-to-center
estimation, half of the perimeter of the rectangle enclosing all terminals in a net,
and the minimum rectilinear spanning/Steiner tree.
a) center-to-center estimation
b) half of the perimeter
Fig.2.3 Wire Length Estimation
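The two estimates of Fig. 2.3 can be sketched in a few lines of Python. This is only an illustration: the function names `hpwl` and `center_to_center` are hypothetical, terminals and module centers are assumed to be (x, y) points, and the center-to-center distance is assumed to be measured in the Manhattan (rectilinear) metric.

```python
def hpwl(terminals):
    """Half of the perimeter of the bounding box enclosing all terminals of a net."""
    xs = [x for x, _ in terminals]
    ys = [y for _, y in terminals]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def center_to_center(center_a, center_b):
    """Manhattan distance between two module centers (assumed metric)."""
    return abs(center_a[0] - center_b[0]) + abs(center_a[1] - center_b[1])


net = [(0, 0), (4, 1), (2, 5)]             # three terminals of one net
print(hpwl(net))                           # bounding box is 4 wide, 5 high
print(center_to_center((1, 1), (4, 5)))    # two module centers
```

Both estimates need only coordinates, which is why they can be used before routing is done.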

2.3.5 Dead space
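Dead space is the space that is wasted; minimizing area is the same as
minimizing dead space. With $A$ the area of the floorplan and $A_i$ the area of
module $i$, the dead space percentage is computed as
$\left( (A - \sum_i A_i) / \sum_i A_i \right) \times 100\%$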
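As a small worked instance of the dead-space computation, the sketch below reads the formula as (A - sum of Ai) / (sum of Ai) x 100%, i.e. with the total module area in the denominator; that reading, and the function name `dead_space_percent`, are assumptions made for illustration.

```python
def dead_space_percent(floorplan_area, module_areas):
    """Dead space relative to the total module area, in percent (assumed reading)."""
    used = sum(module_areas)
    return (floorplan_area - used) / used * 100.0


# A 12 x 10 floorplan holding modules of area 40, 30 and 30.
print(dead_space_percent(120, [40, 30, 30]))
```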
2.4 Approaches to Floorplanning:

Several approaches have been reported to tackle the floorplanning
problem. The reported approaches belong to three general classes:
Constructive,
Iterative, and
Knowledge based.
The constructive algorithms attempt to build a feasible solution
by starting from a seed module; then other modules are selected one
(or a group) at a time and added to the partial floorplan. This process
continues until all the modules have been selected.
Among the approaches that fall into this class are cluster growth,
partitioning and slicing, connectivity clustering, the geometric
approach, mathematical programming, and rectangular dualization.
The iterative techniques start from an initial floorplan. This
floorplan then undergoes a series of perturbations until a feasible
floorplan is obtained or no more improvements can be achieved.
Typical iterative techniques which have been successfully applied to
floorplanning are simulated annealing and genetic algorithms.
The knowledge-based approach has been applied to several
design automation problems including cell generation and layout,
circuit extraction, routing, and floorplanning. In this approach, a
knowledge expert system is implemented which consists of three basic
elements: (a) a knowledge base that contains data describing the
floorplan problem and its current state,
(b) rules stating how to manipulate the data in the knowledge base
in order to progress toward a solution, and (c) an inference engine
controlling the application of the rules to the knowledge base.
2.5 Floorplan structures

The geometrical relationship among the blocks is commonly specified by a
rectangular dissection of the floorplan region. The floorplan region is first
dissected into rectangular rooms and each block is then mapped to a different
room. In order to restrict the size of the solution space, three different ways of
dissection are proposed. The corresponding floorplanning structures are called
slicing, mosaic, and general floorplans. A slicing floorplan is a rectangular
dissection that can be obtained by recursively cutting a rectangle horizontally or
vertically into two smaller rectangles. Otherwise it is a non-slicing floorplan, as
shown in Fig. 2.4.
a. Slicing floorplan
b. Non-slicing floorplan
Fig.2.4 Floorplan structures
2.6 Floorplan Representation

A floorplan representation is used to represent the geometrical
relationships among the blocks. The floorplan representation is perturbed
repeatedly by stochastic techniques to search for a good floorplan. The run
time and the quality of the solutions depend strongly on the size of the solution
space, i.e., the number of possible representations.
2.6.1 Slicing Floorplans
2.6.1.1 Slicing Tree

The first proposed slicing floorplan representation is a binary tree
called a slicing tree. Each leaf of the slicing tree corresponds to a
block and each internal node represents a vertical or horizontal merge operation
on its two descendants. One slicing floorplan may correspond to more than one
slicing tree; this redundancy in the slicing tree was identified later.
(a) Slicing floorplan
(b) slicing tree
Fig.2.5 Slicing floorplan

2.6.1.2 Polish Expression (PE)
A Polish expression is a string of symbols obtained by traversing a binary tree
in post-order; it is used to represent a slicing floorplan. The left child of a
V-cut in the tree represents the left slice in the floorplan. The left child of an
H-cut in the tree represents the top slice in the floorplan. Example:
Fig.2.6 Polish Expression

Problems with PE: multiple representations for some slicing floorplans (when
more than one cut in one direction cuts a floorplan), and hence a larger solution
space.
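Because a Polish expression is a post-order string, the bounding box of the slicing floorplan it encodes can be computed with a simple stack. The sketch below is a minimal illustration under stated assumptions: tokens are block names or the operators "V" (vertical cut, children side by side) and "H" (horizontal cut, children stacked), all blocks are hard and are not rotated, and the names `evaluate_polish` and `dims` are hypothetical.

```python
def evaluate_polish(expr, dims):
    """Return (width, height) of the slicing floorplan given in post-order.

    expr: list of tokens, each a block name or one of "V" / "H".
    dims: dict mapping block name -> (width, height).
    """
    stack = []
    for tok in expr:
        if tok in ("V", "H"):
            w2, h2 = stack.pop()          # right child
            w1, h1 = stack.pop()          # left child
            if tok == "V":                # left slice next to right slice
                stack.append((w1 + w2, max(h1, h2)))
            else:                         # top slice above bottom slice
                stack.append((max(w1, w2), h1 + h2))
        else:
            stack.append(dims[tok])
    if len(stack) != 1:
        raise ValueError("malformed Polish expression")
    return stack[0]


dims = {"a": (2, 3), "b": (1, 3), "c": (3, 2)}
print(evaluate_polish(["a", "b", "V", "c", "H"], dims))
```

Here a and b are first placed side by side, and the result is then stacked with c.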
2.6.2 Floorplan Representations in non slicing structure
2.6.2.1 Sequence Pair (SP)
A sequence pair for a set of modules is a pair of sequences of the module names.
A sequence pair imposes a horizontal/vertical constraint for every pair of modules
as follows. Example:
(ab, ab) => a should be placed to the left of b
(ba, ab) => a should be placed below b
SP1= (ABCDFE, FADEBC), SP2= (ABCDFE, FADBEC)

Fig.2.7 Sequence Pair
SP can handle non-slicing structures and is very flexible in representation.
However, it is time-consuming because the solution space is large, and it is
harder to transform between a sequence pair and a placement. Moreover, a sequence
pair cannot handle soft modules directly.
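The left-of/below rule of the sequence pair can be turned into a straightforward quadratic-time packing. The sketch below is illustrative only: `sp_relation` and `sp_pack` are hypothetical names, modules are hard blocks given as (width, height), and each module is pushed as far left and down as its constraints allow.

```python
def sp_relation(sp, a, b):
    """Relation that the sequence pair imposes on module a with respect to b."""
    first, second = sp
    before_in_first = first.index(a) < first.index(b)
    before_in_second = second.index(a) < second.index(b)
    if before_in_first and before_in_second:
        return "left"                       # (..a..b.., ..a..b..)
    if not before_in_first and not before_in_second:
        return "right"
    return "below" if before_in_second else "above"   # (..b..a.., ..a..b..) => a below b


def sp_pack(sp, dims):
    """Lower-left coordinates of every module; dims maps name -> (width, height)."""
    first, second = sp
    pos1 = {m: i for i, m in enumerate(first)}
    x, y = {}, {}
    for m in second:                        # everything left of / below m comes earlier here
        x_m = max((x[p] + dims[p][0] for p in x if pos1[p] < pos1[m]), default=0)
        y_m = max((y[p] + dims[p][1] for p in y if pos1[p] > pos1[m]), default=0)
        x[m], y[m] = x_m, y_m
    return {m: (x[m], y[m]) for m in second}


dims = {"a": (2, 1), "b": (3, 2)}
print(sp_relation(("ab", "ab"), "a", "b"), sp_pack(("ab", "ab"), dims))
print(sp_relation(("ba", "ab"), "a", "b"), sp_pack(("ba", "ab"), dims))
```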
2.6.2.2 Bounded-Sliceline Grid (BSG)
In the bounded-sliceline grid (BSG) representation, blocks are randomly placed in
a special n-by-n grid, i.e., modules are assigned to n x n rooms. Edge weights of
Gh (Gv) denote the widths (heights) of modules. The corresponding size of the
solution space is even larger than that of SP, and this huge solution space
restricts the applicability of BSG to large floorplan problems.
Fig.2.8 Bounded-Sliceline Grid

2.6.2.3 O-Tree
An O-tree is a rooted directed tree in which the order of the subtrees T1, ..., Tm
is important. The order of the subtrees T1, ..., Tm determines the DFS order when we
traverse the tree. To encode a rooted ordered tree with n nodes, we need a
2(n-1)-bit string T to identify the branching structure of the tree, and a
permutation π as the labels of the n nodes. The bit string T is a realization of
the tree structure. We write a 0 for a traversal which descends an edge and a 1
when it subsequently ascends that edge in the tree. The permutation π is the label
sequence when we traverse the tree in depth-first search order. The first element
in permutation π is the root of the tree.
The following example demonstrates the encoding of an 8-node rooted
ordered tree. Given the 8-node tree shown in Fig. 2.9, whose root node has three
subtrees rooted at a, b, and c, we can represent it by (00110100011011, adbcegf).
Starting from the root, we visit node a first and record a bit 0 to T and a label a
to π. Then we visit node d and record a bit 0 to T and a label d to π. On the way
back to the root from nodes d and a, we record two bits 11 to T. Then we visit
subtrees b and c in sequence, and record the remainder of T and π respectively.
The length of the bit string T is 2(8-1) = 14.
Fig.2.9 O-Tree and placement
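The encoding of the worked example can be reproduced with a short depth-first traversal. The sketch below is an illustration only: the tree is given as a hypothetical `children` dictionary listing each node's subtrees in order, the node names e, f, and g are placed so as to match the example string, and, as in the example, the root itself contributes no label.

```python
def encode_otree(children, root):
    """Return (T, labels): a 0 per edge descended, a 1 per edge ascended,
    and the node labels in depth-first visiting order."""
    bits, labels = [], []

    def visit(node):
        for child in children.get(node, []):
            bits.append("0")        # descend the edge to the child
            labels.append(child)
            visit(child)
            bits.append("1")        # ascend the same edge again

    visit(root)
    return "".join(bits), "".join(labels)


# Root with subtrees a, b, c; d under a; e and f under c; g under e.
tree = {"root": ["a", "b", "c"], "a": ["d"], "c": ["e", "f"], "e": ["g"]}
T, labels = encode_otree(tree, "root")
print(T, labels, len(T))
```

For a tree with n nodes the traversal crosses each of the n-1 edges twice, which gives the 2(n-1) bits.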

The solution space of the O-tree is smaller, transformation between the
representation and the placement takes only linear time, and an O-tree can be
encoded with fewer bits than a sequence pair or BSG. However, the O-tree is less
flexible than BSG/sequence pair in representation; the tree structure is irregular
and harder to implement; the module sequence must be encoded and operated on; the
tree must be transformed to its placement during processing; and inserting
positions are limited, so the search might deviate from the optimum during
solution perturbation.
2.6.2.4 B*-tree
A B*-tree is an ordered binary tree whose root corresponds to the module on the
bottom-left corner. Similar to the DFS procedure, we construct the B*-tree for an
admissible placement P in a recursive fashion. In the example of Fig. 2.10, we make
n0 the root of the tree since b0 is on the bottom-left corner. Constructing the left
subtree of n0 recursively, we make n7 the left child of n0. Since the left child of
n7 does not exist, we then construct the right subtree of n7 (which is rooted at
n8). The construction is recursively performed in DFS order. After completing the
left subtree of n0, the same procedure applies to the right subtree of n0.
Fig.2.10 B*-Tree and placement

The binary-tree based representation is efficient and flexible in dealing with
hard, pre-placed, soft, and rectilinear modules, and the encoding cost of a B*-tree
is small. Except when handling soft modules, a tree can be transformed to its
placement during processing in only linear time. A B*-tree can also evaluate area
cost incrementally, and its solution space is smaller.
2.6.2.5 Corner Block List (CBL)

The corner block list is constructed from the record of a recursive corner block
deletion. For each block deletion, we keep a record of the block name, the corner
block orientation, and the number of T-junctions uncovered. At the end of the
deletion iterations, we concatenate the data of these three items in reverse order.
Thus, we have a sequence S of block names, a list L of orientations, and a list T of
T-junction information. The triple (S, L, T) is called a corner block list. We
use the floorplan of Figure 2.11 as an example.
Fig.2.11 CBL and placement

First, block d is deleted. Block d is vertically oriented and there is one
T-junction attached at the bottom edge of block d. Blocks a, b, g, e, c, and f are
then deleted successively. We concatenate these records in the reverse order of
deletion and derive a corner block list (S, L, T), where S = (fcegbad),
L = (001100), and T = (001010010).
CBL, a new and effective representation for non-slicing floorplans, has the same
computing complexity as the binary tree of a slicing structure. However, it can
represent not only all floorplans with a slicing structure but also non-slicing
floorplans. The time complexity of CBL is much lower than that of other non-slicing
representations such as SP and BSG. CBL has almost the same time and space
complexity as O-tree; however, it is better suited for floorplan optimization with
various size configurations of each block.
CHAPTER 3
TRANSITIVE CLOSURE GRAPH
3.1 Introduction
The transitive closure graph (TCG) is a graph-based representation for general
floorplans. TCG uses a horizontal and a vertical transitive closure graph to
describe the horizontal and vertical relations for each pair of modules.
It combines the advantages of sequence pair, BSG, and B*-tree. Like
sequence pair and BSG, but unlike O-tree, B*-tree, and CBL, TCG satisfies the
four properties of P-admissibility: (1) its solution space is finite, (2) it
guarantees a unique feasible packing for each representation, (3) packing and cost
evaluation can be performed in $O(m^2)$ time, and (4) the best evaluated packing in
the solution space corresponds to an optimum placement. Like B*-tree, but unlike
sequence pair, BSG, O-tree, and CBL, TCG does not need to construct additional
constraint graphs for the cost evaluation during packing, implying faster runtime.
Further, TCG supports incremental update during operations, and keeps the
information of boundary modules as well as the shapes and the relative positions
of modules in the representation. More importantly, the geometric relation among
modules is transparent not only to the TCG representation but also to its
operations (i.e., the effect of an operation on the change of the geometric relation
is known before packing), facilitating faster convergence to a desired solution. All
these properties make TCG an effective and flexible representation for handling
the general floorplan/placement design problems with various constraints such as
boundary constraints. Compared with O-tree, enhanced O-tree, and B*-tree, TCG
has much smaller runtime requirements.
3.2 From a Placement to Its TCG

For two non-overlapped modules bi and bj, bi is said to be horizontally
(vertically) related to bj, denoted by bi ⊢ bj (bi ⊥ bj), if bi is on the left
(bottom) side of bj and their projections on the y (x) axis overlap. For two
non-overlapped modules bi and bj, bi is said to be diagonally related to bj if bi
is on the left side of bj and their projections on the x and the y axes do not
overlap. In a placement, every two modules must bear one of the three relations:
horizontal relation, vertical relation, or diagonal relation. To simplify the
operations on geometric relations, we treat a diagonal relation for modules bi and
bj as a horizontal one, unless there exists a chain of vertical relations from bi
(bj), followed by the modules enclosed within the rectangle defined by the two
closest corners of bi and bj, and finally to bj (bi), in which case we make
bi ⊥ bj (bj ⊥ bi). Figure 3.1 shows a placement and its TCG (Ch, Cv).
a) Floorplan
b) Horizontal closure graph c) Vertical closure graph

Fig.3.1 floorplan to TCG
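The three pairwise relations can be checked directly from module coordinates. The sketch below is a minimal illustration, assuming each module is an axis-aligned rectangle (x, y, width, height) with its lower-left corner at (x, y); the function names are hypothetical, and the chain-of-vertical-relations exception for diagonal pairs is not handled.

```python
def projections_overlap(lo1, hi1, lo2, hi2):
    """True if the open intervals (lo1, hi1) and (lo2, hi2) intersect."""
    return lo1 < hi2 and lo2 < hi1


def relation(bi, bj):
    """Relation of bi to bj for two non-overlapped modules.

    Returns "horizontal", "vertical" or "diagonal" when bi is on the
    left/bottom side of bj, and None otherwise (then try relation(bj, bi)).
    """
    xi, yi, wi, hi = bi
    xj, yj, wj, hj = bj
    x_overlap = projections_overlap(xi, xi + wi, xj, xj + wj)
    y_overlap = projections_overlap(yi, yi + hi, yj, yj + hj)
    if y_overlap and xi + wi <= xj:
        return "horizontal"      # bi is left of bj, y projections overlap
    if x_overlap and yi + hi <= yj:
        return "vertical"        # bi is below bj, x projections overlap
    if not x_overlap and not y_overlap and xi + wi <= xj:
        return "diagonal"        # bi is left of bj, no projection overlaps
    return None


print(relation((0, 0, 2, 2), (2, 1, 2, 2)))
print(relation((0, 0, 2, 2), (1, 2, 2, 2)))
print(relation((0, 0, 1, 1), (2, 2, 1, 1)))
```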
3.3 From a TCG to its placement

Given a TCG, its corresponding placement can be obtained by performing a
well-known longest path algorithm, the Bellman-Ford algorithm, on the TCG. To
facilitate the implementation of the longest path algorithm, we augment the given
two closure graphs as follows. We introduce two special nodes with zero weights
for each closure graph, the source ns and the sink nt, and construct an edge from
ns to each node with in-degree equal to zero, and also from each node with
out-degree equal to zero to nt. Figure 3.2 shows the augmented Ch and Cv for
the TCG shown in Figures 3.1(b) and 3.1(c).
(a) Augmented Ch
(b) Augmented Cv
Fig.3.2 Augmented TCG.

3.3.1 The Bellman Ford Algorithm :
Pseudocode
for (i ← 1; i ≤ n; i ← i + 1)
    Xi ← −∞;
X0 ← 0;
Count ← 0;
S1 ← {V0};
S2 ← ∅;
while (Count ≤ n && S1 ≠ ∅)
{
    for each Vi ∈ S1
        for each Vj such that (Vi, Vj) ∈ E
            if (Xj < Xi + dij)
            {
                Xj ← Xi + dij;
                S2 ← S2 ∪ {Vj}
            }
    S1 ← S2;
    S2 ← ∅;
    Count ← Count + 1;
}
if (Count > n)
    Error (positive cycle);

Let Lh(ni) (Lv(ni)) be the length of the longest path from ns to ni in the
augmented Ch (Cv). Lh(ni) (Lv(ni)) can be determined by performing the
single-source longest path algorithm on the augmented Ch (Cv). The coordinate
(xi, yi) of a module bi is given by (Lh(ni), Lv(ni)). Since the respective width
and height of the placement for the given TCG are Lh(nt) and Lv(nt), the area of
the placement is given by Lh(nt) × Lv(nt).
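The packing step can be sketched in Python along the lines of the longest path procedure described above. This is an illustrative sketch, not the thesis implementation: the names `augment`, `longest_paths`, and `pack` are hypothetical, each closure graph is assumed to be given as a list of (ni, nj) edges, and the length of an edge (ni, nj) is assumed to be the weight (width in Ch, height in Cv) of its tail node ni.

```python
import math


def augment(edges, weight):
    """Add the zero-weight source "ns" and sink "nt" to one closure graph."""
    has_in = {v for _, v in edges}
    has_out = {u for u, _ in edges}
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    succ["ns"] = [v for v in weight if v not in has_in]
    for u in weight:
        if u not in has_out:
            succ.setdefault(u, []).append("nt")
    return succ, dict(weight, ns=0, nt=0)


def longest_paths(succ, weight, source="ns"):
    """Worklist Bellman-Ford for longest paths; an edge (u, v) has length weight[u]."""
    n = len(weight)
    dist = {v: -math.inf for v in weight}
    dist[source] = 0
    frontier, count = {source}, 0
    while frontier and count <= n:
        updated = set()
        for u in frontier:
            for v in succ.get(u, ()):
                if dist[v] < dist[u] + weight[u]:
                    dist[v] = dist[u] + weight[u]
                    updated.add(v)
        frontier = updated
        count += 1
    if count > n:
        raise ValueError("positive cycle: not a valid closure graph")
    return dist


def pack(ch_edges, cv_edges, width, height):
    """Return module coordinates and the area Lh(nt) * Lv(nt)."""
    lh = longest_paths(*augment(ch_edges, width))
    lv = longest_paths(*augment(cv_edges, height))
    return {m: (lh[m], lv[m]) for m in width}, lh["nt"] * lv["nt"]


# Module a (2 x 3) placed to the left of module b (4 x 1).
coords, area = pack([("a", "b")], [], {"a": 2, "b": 4}, {"a": 3, "b": 1})
print(coords, area)
```

Because the closure graphs of a valid TCG are acyclic, the loop ends as soon as no coordinate changes; the cycle check only guards against malformed input.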
3.4 Floorplanning Algorithm

A simulated annealing based algorithm is used for Solution Perturbation. Given an
initial solution represented by a TCG, the algorithm perturbs the TCG to obtain a
new TCG.
3.4.1 Solution Perturbation
Four operations can be applied to perturb a TCG to obtain a new TCG
3.4.1.1 Rotate
To rotate a module bi, we only need to exchange the weights of the corresponding
node ni in Ch and Cv. Figure 3.3(b) shows the resulting Ch, Cv, and placement
after rotating the module d shown in Figure 3.3(a). The weights associated with
the node nd in Ch and Cv have been exchanged.
Fig 3.3(a) initial configuration of TCG
Fig 3.3 (b) rotate module d

3.4.1.2 Swap
To swap two nodes ni and nj, we only need to exchange the two nodes in both Ch and
Cv. Fig 3.4(b) shows the resulting Ch, Cv, and placement after swapping the
nodes na and nb shown in Fig 3.4(a). Notice that the nodes na and nb in both Ch
and Cv have been exchanged.
Fig 3.4(a) TCG before swap
Fig 3.4 (b) swap na and nb
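Rotate and Swap touch only node weights and node labels, so they are easy to sketch. The code below is a minimal illustration under assumed data structures: node weights are kept in two dictionaries `width` (weights in Ch) and `height` (weights in Cv) keyed by module name, and each closure graph is a set of (ni, nj) edges; the function names are hypothetical.

```python
def rotate(width, height, m):
    """Rotate module m: exchange the weights of node m in Ch and Cv."""
    width[m], height[m] = height[m], width[m]


def swap(ch, cv, a, b):
    """Exchange nodes a and b in both closure graphs.

    Weights stay attached to the module names, so exchanging the labels in
    every edge makes the two modules trade positions in both graphs.
    """
    def rename(n):
        if n == a:
            return b
        if n == b:
            return a
        return n

    new_ch = {(rename(u), rename(v)) for u, v in ch}
    new_cv = {(rename(u), rename(v)) for u, v in cv}
    return new_ch, new_cv


width, height = {"a": 2, "b": 4, "c": 1}, {"a": 3, "b": 1, "c": 5}
rotate(width, height, "a")
print(width["a"], height["a"])

ch, cv = {("a", "b"), ("a", "c")}, {("b", "c")}
print(swap(ch, cv, "a", "b"))
```

Neither operation changes the edge structure of the graphs up to relabelling, so no closure update is needed for them.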

3.4.1.3 Reverse
The Reverse operation reverses the direction of a reduction edge (ni, nj) in a
transitive closure graph, which corresponds to changing the geometric relation of
the two modules bi and bj. For two modules bi and bj, bi ⊢ bj (bi ⊥ bj) if there
exists a reduction edge (ni, nj) in Ch (Cv); after reversing the edge (ni, nj), we
have the new geometric relation bj ⊢ bi (bj ⊥ bi). Therefore, the geometric
relation among modules is transparent not only to the TCG representation but also
to the Reverse operation (i.e., the effect of such an operation on the change of the
geometric relation is known before packing); this property can facilitate the
convergence to a desired solution.
Fig 3.5(a) TCG before reverse
Fig 3.5 (b) reverse (nc, ne)
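Reverse and Move apply only to reduction edges. The sketch below shows one way to list them, assuming the usual transitive-reduction notion that an edge (ni, nj) of a closure graph is a reduction edge when there is no other path from ni to nj; this definition and the name `reduction_edges` are assumptions made for illustration. In a transitively closed graph, such a path exists exactly when some intermediate node nk has both (ni, nk) and (nk, nj) as edges.

```python
def reduction_edges(closure_edges):
    """Edges (u, v) of a transitive closure graph with no intermediate node k
    such that (u, k) and (k, v) are both edges."""
    edges = set(closure_edges)
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    return {
        (u, v)
        for u, v in edges
        if not any((k, v) in edges for k in succ[u] if k != v)
    }


# a -> b -> c with the closure edge a -> c: only (a, b) and (b, c) are reduction edges.
ch = {("a", "b"), ("b", "c"), ("a", "c")}
print(sorted(reduction_edges(ch)))
```

After an edge is reversed or moved, the closure graphs still have to be updated so that they remain transitively closed; that update is not shown here.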

3.4.1.4 Move
The Move operation moves a reduction edge (ni, nj) in one transitive closure graph
to the other, which corresponds to switching the geometric relation of the two
modules bi and bj between a horizontal relation and a vertical one. For two
modules bi and bj, bi ⊢ bj (bi ⊥ bj) if there exists a reduction edge (ni, nj) in
Ch (Cv); after moving the edge (ni, nj) to Cv (Ch), we have the new geometric
relation bi ⊥ bj (bi ⊢ bj). Therefore, the geometric relation among modules is also
transparent to the Move operation.
Fig 3.6(a) TCG before move
Fig 3.6(b) move (nb, ne)
3.5 Simulated Annealing physical model

Annealing is a heat-treatment process in which a material is slowly cooled,
allowing the molecules to arrange themselves in such a way that the material is
less strained and therefore more stable. If materials such as glass or metal are
cooled too quickly, their constituent molecules will be under high stress, leading
to failure (breaking) if further thermal or physical shocks are encountered. Slowing
the cooling of the material allows each molecule to move into a position of lower
stress. While the material is kept at a high temperature, the molecules are able to
move around quite freely, thus reducing stress on a large scale; indeed, if the
material is made too hot it will move into the liquid state, allowing free movement
of the molecules. As the material is cooled, the molecules are not able to move
around as freely but still move limited distances, reducing stress in local
regions. The result is a material with significantly less internal stress that is
resistant to failure due to external shock. If one equates molecules to
components and the substance to the overall design of an electronic circuit,
Simulated Annealing can be applied to efficiently place the system onto the target
die.
Figure 3.7 Molecules Movements per Temperature Region

3.5.1 Simulated annealing
Simulated annealing is a generic probabilistic search algorithm for finding a good
approximation of the global optimum of a given objective function in a large
discrete search space. During each search step the annealing algorithm replaces
the current solution by a randomly selected "neighbor" solution. The neighbor is
chosen with a probability that depends on the difference between the
corresponding objective function values and on a global control parameter T,
typically referred to as temperature. Starting from a high value, T is gradually
decreased during the search such that the current solution changes almost
randomly when T is large, but the moves become increasingly biased towards
better solutions as T approaches zero. The possibility of uphill moves for larger
values of T ensures probabilistically that the search climbs out of local minima.
Figure 3.8: Series of Neighboring Solutions Containing a Local Minimum
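A generic annealing loop can be sketched as follows. The text above does not fix the acceptance rule or the cooling schedule, so this sketch assumes the commonly used Metropolis criterion (accept an uphill move of size delta with probability exp(-delta / T)) and geometric cooling; all parameter names and default values are hypothetical.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor, t_start=100.0, t_end=1e-3,
                        cooling=0.95, moves_per_temp=50, seed=0):
    """Return the best solution found and its cost."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            candidate = neighbor(current, rng)
            delta = cost(candidate) - current_cost
            # Downhill moves are always taken; uphill moves only with a
            # probability that shrinks as T decreases (assumed Metropolis rule).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= cooling                      # assumed geometric cooling schedule
    return best, best_cost


# Toy problem: minimise (x - 3)^2 over the integers, moving one step at a time.
best, best_cost = simulated_annealing(
    initial=0,
    cost=lambda x: (x - 3) ** 2,
    neighbor=lambda x, rng: x + rng.choice((-1, 1)),
)
print(best, best_cost)
```

For floorplanning, the solution would be a TCG, `neighbor` one of the perturbation operations, and `cost` the packing cost.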
3.5.2 Application to Combinatorial Optimization

Simulated Annealing (SA) is a stochastic algorithm. Just as a Genetic Algorithm
models evolution as a way to select an optimal solution, Simulated
Annealing looks to the model of molecules in a heated mass and the way they
behave as they cool to form a structured solid. The aim of the algorithm is to
reduce the energy of the system through slow cooling. As applied to placement,
system energy is measured by the inefficiency (cost) of the placement; a poor
placement causes the system to have higher energy. The analogy is drawn from
molecules in the cooling mass to components in the placement: a quickly cooled
mass is fragile, just as a poorly placed design is inefficient, because the
molecules and components, respectively, are arranged in such a way that they
experience internal tensions as they try to move to regions which would
lower their energy. After the material is cooled, the molecules (components) are
frozen in these non-optimal positions, resulting in overall fragility of the system.
3.5.3 SA & Placement
VLSI placement in general consists of placing rectilinear components onto
a rectangular or square die area in such a way that the interconnect wire length is
minimized. In general, components are free to move to any location on the die, and
the interconnect wire length is calculated by measuring and summing the length
of wire used to connect each net, that is, each connection among a group of ports.
3.5.4 Cost Function
The initial placement may simply be given as a random placement. The cost
function usually comprises several parameters, each measuring a different
aspect of the current solution. A very simple and widely used cost function
parameter is the interconnect wire length of a placement solution, which can be
easily approximated using the bounding box method. This wire length estimation
method draws a bounding box around all ports in a given net; half the perimeter of
this box is taken as the net's interconnect length approximation. The
half-perimeter wire length (HPWL) estimate is exact for minimally routed two- and
three-port nets. The sum of the HPWL over all nets gives the approximate
interconnect wire length for a placement solution. Another widely used cost
function parameter is component overlap: as design rules do not allow components
to overlap each other, any instance of overlap should be penalized, and a penalty
factor is added directly to the wire length approximation. A commonly used
objective function is a weighted sum of area and wire length:
cost = αA + βL, where A is the total area of the packing, L is
the total wire length, and α and β are constants.
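The HPWL estimate and the weighted-sum cost can be sketched in a few lines of Python. This is only an illustration: the function names, the default weights, and the pin coordinates are assumptions made for the example and are not taken from the implementation described in this report.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def hpwl(pins: List[Point]) -> float:
    """Half the perimeter of the bounding box around all pins of one net."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_wirelength(nets: List[List[Point]]) -> float:
    """Sum of the HPWL estimates over all nets."""
    return sum(hpwl(pins) for pins in nets)


def placement_cost(area: float, wirelength: float,
                   alpha: float = 1.0, beta: float = 1.0) -> float:
    """Weighted-sum objective: cost = alpha * A + beta * L."""
    return alpha * area + beta * wirelength


# Hypothetical pin coordinates for two nets (illustrative values only).
nets = [[(0, 0), (3, 4)], [(1, 1), (2, 5), (4, 2)]]
wirelength = total_wirelength(nets)                # 7 + 7 = 14
cost = placement_cost(20.0, wirelength, 0.5, 0.5)  # 0.5*20 + 0.5*14 = 17.0
```

An overlap penalty, if used, would be one more weighted term added to the same sum.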
The simplest way to generate a new placement is to move one random
component from one position to another random position; another fairly
simple change is to swap the positions of two random components. Changing a
component's orientation moves its ports, resulting in small changes to the
interconnect lengths of the nets of which the component is a member. This type of
move is usually only performed when no other type of perturbation yields a new
solution. The key to applying Simulated Annealing to placement is the
cooling schedule that the algorithm follows. As the algorithm's greediness is
inversely related to the system's temperature, moves may be accepted that
actually increase a placement's cost. Enough time must be spent in the
upper and lower temperatures to allow first a quick arrangement of the system and
then a final localized arrangement, respectively. If too much time is spent in the
upper temperatures, processing time is wasted because many inefficient intermediate
solutions are accepted; if too much time is spent in the lower temperatures,
processing time is again wasted because of the tight restriction on accepted moves.
The algorithm is generalized by Figure 3.9.
3.5.5 Simulated annealing algorithm
Algorithm Simulated Annealing
Begin
Temp = Initial_Temp
Current_placement = Initial_Placement
While (Temp > Final_Temp) do
{
    Iterations = 0
    While (Iterations < Max_Iterations) do
    {
        New_placement = Rand_Move (Current_placement)
        c = COST (New_placement) - COST (Current_placement)
        If (c < 0) then
            Current_placement = New_placement
        Else if (exp (-c/Temp) > Random (0, 1)) then
            Current_placement = New_placement
        Iterations = Iterations + 1
    }
    Temp = Schedule (Temp)
}
End
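The pseudocode above can be made concrete as a small generic Python routine. This is a sketch under stated assumptions: the parameter values, the geometric schedule, the seeded random generator, and the toy one-dimensional ordering problem are all hypothetical, and the routine additionally remembers the best solution seen, which the pseudocode does not do.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor, t_initial=100.0,
                        t_final=0.01, alpha=0.95, moves_per_temp=50, seed=0):
    """Generic SA loop; returns the best solution encountered."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temp = t_initial
    while temp > t_final:
        for _ in range(moves_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Downhill moves are always taken; uphill moves are taken
            # with probability exp(-delta / temp).
            if delta < 0 or math.exp(-delta / temp) > rng.random():
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temp *= alpha  # Schedule(Temp): geometric cooling (an assumption)
    return best


# Toy problem (hypothetical): order five modules on a line so that the
# modules joined by a two-pin net sit close together.
NETS = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]


def line_cost(order):
    pos = {m: i for i, m in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in NETS)


def swap_two(order, rng):
    i, j = rng.sample(range(len(order)), 2)
    new = list(order)
    new[i], new[j] = new[j], new[i]
    return tuple(new)


start = ("c", "a", "e", "b", "d")
result = simulated_annealing(start, line_cost, swap_two)
```

For a floorplanner, `cost` would evaluate the packed area and wire length and `neighbor` would apply one perturbation to the representation.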
Figure 3.9 Simulated Annealing Placement Flow Chart
3.5.6 Cooling Schedule

The cooling schedule can follow any function, but it typically employs
two slopes: a steep slope for the extreme high and low temperatures and a
smaller slope for the intermediate temperatures, where the most beneficial
placement changes are made. As stated above, changes resulting in a reduced
cost are always accepted, but changes resulting in an elevated cost are not
always rejected. Temperature affects the probability of a cost-increasing change
being accepted through the specific form exp(-c/T),
where c is the positive cost change due to the new placement and T is the
current system temperature given by the cooling schedule. This function is
combined with a random value generator: if the randomly generated value, drawn
uniformly from (0, 1), is less than exp(-c/T), the new placement is accepted. For
very large temperatures almost any change is accepted, while as the temperature
is reduced the chance that a positive cost change is accepted is also reduced.
Each temperature step may contain several placement perturbations; adjusting
this number is one of the refinements that may be made to deliver a more
computationally efficient placement.
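To make the effect of temperature tangible, the acceptance rule can be evaluated for one fixed cost increase at several temperatures. The function name and the sample values below are assumptions chosen for illustration.

```python
import math


def acceptance_probability(delta_cost: float, temperature: float) -> float:
    """Probability that a move with cost change delta_cost is accepted."""
    if delta_cost <= 0:
        return 1.0  # cost-reducing (or neutral) moves are always accepted
    return math.exp(-delta_cost / temperature)


# The same cost increase of 10 at three hypothetical temperatures.
for temperature in (100.0, 10.0, 1.0):
    p = acceptance_probability(10.0, temperature)
    print(f"T = {temperature:6.1f}  ->  P(accept) = {p:.4f}")
# T =  100.0  ->  P(accept) = 0.9048
# T =   10.0  ->  P(accept) = 0.3679
# T =    1.0  ->  P(accept) = 0.0000
```

A move is then accepted when a uniform random number in (0, 1) falls below this probability.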
One of the most important parts is the proper definition of the cooling
schedule in order to maximize the cost reduction for each temperature step. If too
much time is spent in the upper temperatures, placement attempts are wasted because
an inordinate number of cost-increasing moves are accepted. At this point in the
temperature schedule the design can be thought of as liquid, and placement is
simply randomizing the initial solution. If the design is cooled too quickly, the
algorithm tends to get trapped in local minima that would be avoided if
the design were allowed to cool slowly. Slow cooling allows cost-increasing moves
to be accepted, moving the design to new points from which previously
unavailable cost-reducing moves may become possible. If the design is cooled too
slowly, each temperature step reaches a point where no further cost reductions are
seen; the algorithm converges on a cost that is then merely maintained by the
combination of accepting cost-increasing moves and discovering cost-reducing moves.
The initial temperature, cooling schedule, and freezing point are usually
determined experimentally. Some common cooling schedules are t = αt, where α is
typically around 0.95, and t = e^(-βt), where β is typically around 0.7.
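As a small illustration of the geometric schedule with a factor of 0.95, the sketch below lists the temperature steps between an initial and a final temperature. The start and stop temperatures are hypothetical; in practice they are tuned experimentally as noted above.

```python
def geometric_schedule(t_initial: float, t_final: float, alpha: float = 0.95):
    """Yield temperatures t, alpha*t, alpha^2*t, ... while above t_final."""
    t = t_initial
    while t > t_final:
        yield t
        t *= alpha


temps = list(geometric_schedule(100.0, 1.0))
steps = len(temps)  # 90 temperature steps to cool from 100 down to 1
```

With a factor closer to 1 the number of steps grows quickly, which is the runtime price of slower cooling.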
3.5.7 Advantages and Disadvantages
The simulated annealing algorithm is one of the most established
algorithms for placement problems. It produces good-quality
floorplans. However, simulated annealing is computationally
expensive and can lead to long runtimes. Therefore, it is only
suitable for small to medium-sized circuits. Although it is proven to
converge to the optimum, it does so only in infinite time. For this
reason, and because the temperature has to be lowered slowly, the
algorithm is usually not faster than its contemporaries.
Chapter 4
TCGS: Combination of TCG and SP
4.1 Introduction
This chapter explores the equivalence of the two most promising P*-admissible
representations, TCG and SP, and integrates TCG with a packing sequence (part of
SP) into a new representation called TCG-S. TCG-S combines the advantages of SP
and TCG and at the same time eliminates their disadvantages. With the property
of SP, faster packing and perturbation schemes are possible. With the properties
inherited from TCG, the geometric relations among modules are transparent to
TCG-S (implying faster convergence to a desired solution), placement with
position constraints becomes much easier, and incremental update for cost
evaluation can be realized. These properties make TCG-S a superior
representation that exhibits an elegant solution structure to facilitate the
search for a desired floorplan/placement. TCG-S is reported to give the best
area utilization, wirelength optimization, convergence speed, and stability
among existing works, and it is very flexible in handling placement with special
constraints.
4.2 P*-admissible and non-P*-admissible representations

A representation is said to be P-admissible if it satisfies the following four
conditions: (1) the solution space is finite, (2) every solution is feasible, (3)
packing and cost evaluation can be performed in polynomial time, and (4) the best
evaluated packing in the space corresponds to an optimal placement. The
P-admissible representation is extended to a P*-admissible one by adding a
fifth condition: (5) the geometric relation between each pair of modules is defined
in the representation. With this condition, general floorplans/placements can be
modeled. A P*-admissible representation therefore contains a complete structure
for searching for an optimal floorplan/placement solution, and it is
desirable to develop an effective and flexible P*-admissible representation.
Among the existing popular representations, SP, BSG, and TCG are
P*-admissible while slicing tree, NPE, O-tree, B*-tree, CBL, and Q-sequence are
not. The slicing tree and NPE are intended for slicing floorplans only. Since an
optimal placement could be a non-slicing structure, the two representations are not
P-admissible and thus not P*-admissible (i.e., violation of P*-admissibility
Condition (4)). An O-tree defines only a one-dimensional geometric relation
between compacted modules and thus can obtain the relation in the other
dimension only after packing (i.e., violation of Condition (5)). A B*-tree requires
a placement to be left and/or bottom compacted. However, the space intended for
placing a module may be occupied by previously placed modules during packing,
resulting in a mismatch between the original representation and its compacted
placement. Therefore, it may not be feasible to find a compacted placement
corresponding to the original B*-tree, and thus it is not P-admissible (i.e.,
violation of Condition (2)). CBL and Q-sequence can represent only mosaic
floorplans, in which each region in the floorplan contains exactly one module.
CBL and Q-sequence are not P-admissible because they cannot guarantee a feasible
solution after a perturbation (i.e., violation of Conditions (2) and (4)).
4.3 Combining TCG and SP

Both SP and TCG are considered very flexible representations, and both construct
constraint graphs to evaluate their packing cost. SP consists of two sequences of
modules (Γ+, Γ−), where Γ+ specifies the module ordering from top-left to
bottom-right and Γ− corresponds to the ordering from bottom-left to top-right.
Γ− can be used to guide module packing. However, like most existing
representations (e.g., NPE, BSG, O-tree, B*-tree, CBL, Q-sequence), the geometric
relations between modules are not transparent to the operations of SP (i.e., the
effect of an operation on the change of module relation is not clear before
packing), and thus we need to construct constraint graphs from scratch after each
perturbation to evaluate the packing cost; this deficiency makes it harder for SP
to converge to a desired solution and to handle placement with constraints (e.g.,
boundary modules, pre-placed modules, etc.).
TCG consists of a horizontal transitive closure graph Ch to define the
horizontal geometric relations between modules and a vertical one Cv for vertical
geometric relations. In contrast to SP, the geometric relations between modules are
transparent to TCG as well as its operations, facilitating the convergence to a
desired solution. Further, TCG supports incremental update during operations and
keeps the information of boundary modules as well as the shapes and the relative
positions of modules in the representation. Nevertheless, like SP, TCG also needs
constraint graphs to evaluate its packing cost, and unlike SP, it requires
extra operations to obtain the module packing sequence.
4.5 Problem Definition

Let B = {b1, b2, b3, ...} be a set of rectangular modules whose width, height, and
area are denoted by wi, hi, and ai. A placement P is an assignment of a position to
each module such that no two modules overlap. The goal of floorplanning/placement
is to optimize a predefined cost metric such as a combination of the area (i.e.,
the minimum bounding rectangle of P) and/or the wirelength (i.e., the summation of
the half bounding box of interconnections) induced by the assignment of the
modules bi on the chip.
4.6 Equivalence of SP and TCG

We can transform between TCG and SP as follows. Let the fan-in (fan-out) of a
node ni, denoted by fin(ni) (fout(ni)), be the nodes nj with edges (nj, ni)
((ni, nj)). Given a TCG, we can obtain the sequence Γ+ in reverse order (last
module first) by repeatedly extracting a node ni with fin(ni) = 0 in Cv and
fout(ni) = 0 in Ch, and then deleting the edges (nj, ni) ((ni, nj)) from Ch (Cv)
until no node is left in Ch (Cv). Such a node has no remaining module to its right
and none below it, so it follows every remaining module in Γ+. Similarly, we can
transform a TCG into the sequence Γ− by repeatedly extracting a node ni with
fin(ni) = 0 in both Ch and Cv and then deleting the edges (ni, nj) from both Ch
and Cv until no node is left in Ch and Cv. Given an SP (Γ+, Γ−), we can obtain a
unique TCG (Ch, Cv).
4.7 Comparison between TCG and SP

Γ− of an SP corresponds to the ordering for packing modules toward the bottom-left
direction and thus can be used for guiding module packing. However, like most
existing representations, the geometric relations among modules are not
transparent to the operations of SP (i.e., the effect of an operation on the change of
module relation is not clear before packing), and thus we need to construct
constraint graphs from scratch after each perturbation to evaluate the packing cost;
this deficiency makes it harder for SP to converge to a desired solution and to handle
placement with constraints (e.g., boundary modules, pre-placed modules, etc.).
In contrast to SP, the geometric relations among modules are transparent to TCG as
well as its operations, facilitating the convergence to a desired solution. Further,
TCG supports incremental update during operations and keeps the information of
boundary modules as well as the shapes and the relative positions of modules in
the representation. Unlike SP, nevertheless, TCG needs extra operations to
obtain the module packing sequence and additional time to find a special type
of edges, called reduction edges, in Ch (Cv) for some operations.
For both SP and TCG, the packing scheme that applies the longest path
algorithm is time-consuming since all edges in the constraint graphs are
processed, even those that are not on the longest path. If we add a source with
zero weight and connect it to those nodes with zero in-degree, the x coordinate of
each module can be obtained by applying the longest path algorithm on the
resulting directed acyclic graph. Let (xi, yi) and (xi', yi') denote the
bottom-left and top-right corners of module bi. In the example, module g has the
horizontal predecessors a, b, c, d, and e, so we have
xg = max(xa', xb', xc', xd', xe').
However, if we place modules based on the sequence Γ− and maintain a horizontal
and a vertical contour, denoted by RH and RV respectively, for the placed
modules, the number of nodes that need to be considered can be reduced. Let RH (RV)
be a list of modules bi for which there exists no module bj with yj ≥ yi' (xj ≥ xi')
and xj' ≥ xi' (yj' ≥ yi'). Suppose we have packed the modules a, b, c, d, and e based
on the sequence Γ−; then the resulting horizontal contour is RH = <c, e, d>.
Keeping RH, we only need to traverse the contour from e, the successor of c, to the
last module d, which have horizontal relations with g. Thus, we have xg = xd'.
Packing modules in this way, we only need to consider xe' and xd' and can avoid
computing a maximum over all predecessors, leading to a faster packing scheme.
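The baseline longest-path packing (not the faster contour-based scheme) can be sketched as follows. The sketch assumes that the packing sequence Γ− is a valid topological order of the constraint graph, and it uses a hypothetical three-module example; the function and variable names are illustrative only.

```python
def longest_path_coordinates(gamma_minus, edges, weight):
    """Coordinate of each module = longest weighted path from the source.

    gamma_minus: modules in packing order (a topological order of `edges`).
    edges: set of (i, j) pairs, meaning i precedes j in this dimension.
    weight: module -> width (for Ch) or height (for Cv).
    """
    preds = {m: [] for m in gamma_minus}
    for i, j in edges:
        preds[j].append(i)
    coord = {}
    for m in gamma_minus:
        # A module starts where its furthest-reaching predecessor ends.
        coord[m] = max((coord[p] + weight[p] for p in preds[m]), default=0.0)
    return coord


# Hypothetical example: b sits on top of a, and c is to the right of both.
gamma_minus = ["a", "b", "c"]
ch = {("a", "c"), ("b", "c")}   # horizontal relations
cv = {("a", "b")}               # vertical relations
width = {"a": 2.0, "b": 1.0, "c": 1.0}
height = {"a": 1.5, "b": 1.0, "c": 2.0}

x = longest_path_coordinates(gamma_minus, ch, width)
y = longest_path_coordinates(gamma_minus, cv, height)
chip_width = max(x[m] + width[m] for m in gamma_minus)    # 3.0
chip_height = max(y[m] + height[m] for m in gamma_minus)  # 2.5
```

Every edge is visited once here, which is exactly the work the contour-based scheme tries to avoid.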
4.8 The TCG-S Representation

Combining TCG (Ch, Cv) and SP (Γ+, Γ−), we develop a representation called
TCG-S = (Ch, Cv, Γ−), which uses a horizontal and a vertical transitive closure graph as
well as a packing sequence Γ− to represent a placement.
4.8.1 From a placement to TCG-S
This section shows how to construct (Ch, Cv) and Γ− from a placement. First,
Γ− is extracted from the placement, and then Ch and Cv are constructed according to Γ−.
For two non-overlapped modules bi and bj, bi is said to be horizontally
(vertically) related to bj, denoted by bi ⊢ bj (bi ⊥ bj), if bi is on the left (bottom)
side of bj and their projections on the y (x) axis overlap. (Two modules cannot
have both horizontal and vertical relations unless they overlap.) For two
non-overlapped modules bi and bj, bi is said to be diagonally related to bj if bi is on
the left side of bj and their projections on the x and the y axes do not overlap. In a
placement, every two modules must bear one of the three relations: horizontal
relation, vertical relation, or diagonal relation. To simplify the operations on
geometric relations, we treat a diagonal relation for modules bi and bj as a
horizontal one, unless there exists a chain of vertical relations from bi (bj),
followed by the modules enclosed within the rectangle defined by the two closest
corners of bi and bj, and finally to bj (bi), for which we make bi ⊥ bj (bj ⊥ bi).
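The three geometric relations can be checked directly from module coordinates. The sketch below is an illustration with assumed names (`Rect`, `relation`) and hypothetical coordinates; it only classifies a pair of modules and does not implement the chain-of-vertical-relations exception used when a diagonal relation is converted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float  # bottom-left corner
    y: float
    w: float  # width
    h: float  # height


def _overlap(lo1, hi1, lo2, hi2):
    """True if the open intervals (lo1, hi1) and (lo2, hi2) intersect."""
    return lo1 < hi2 and lo2 < hi1


def relation(bi: Rect, bj: Rect) -> str:
    """Classify two non-overlapping modules."""
    x_overlap = _overlap(bi.x, bi.x + bi.w, bj.x, bj.x + bj.w)
    y_overlap = _overlap(bi.y, bi.y + bi.h, bj.y, bj.y + bj.h)
    if x_overlap and y_overlap:
        raise ValueError("modules overlap")
    if y_overlap:
        return "horizontal"  # projections on the y axis overlap
    if x_overlap:
        return "vertical"    # projections on the x axis overlap
    return "diagonal"        # neither projection overlaps


# Hypothetical placement.
a = Rect(0.0, 0.0, 2.0, 1.5)
b = Rect(0.0, 1.5, 1.0, 1.0)   # directly above a
c = Rect(2.0, 0.0, 1.0, 2.0)   # directly to the right of a
d = Rect(3.0, 2.5, 1.0, 1.0)   # up and to the right of a
```

Which module is on the left (or at the bottom) then fixes the direction of the corresponding edge.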
(a) A placement
(b) The corresponding TCG-S of (a).

Fig 4.1: a placement to TCG-S
4.8.2 Sequence Pair Representation from a placement

Given a placement, Γ− can be extracted based on the procedure
described. A sequence pair imposes a horizontal/vertical (H/V)
constraint for every pair of modules as follows:
(..a..b.., ..a..b..) => a should be placed to the left of b
(..b..a.., ..a..b..) => a should be placed below b.
After extracting Γ−, we can construct Ch and Cv based on Γ−. For each
module bi in Γ−, we introduce a node ni with the weight being bi's width
(height) in Ch (Cv). Also, for each module bi before bj in Γ−, we introduce an edge
(ni, nj) in Ch (Cv) if bi ⊢ bj (bi ⊥ bj), as shown in Figure 4.1.
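The H/V rule of a sequence pair translates directly into the edge sets of the two graphs. The following sketch assumes both sequences are given as Python lists of module names; the function name and the three-module example are hypothetical.

```python
def sequence_pair_to_tcg(gamma_plus, gamma_minus):
    """Derive the edge sets (Ch, Cv) from a sequence pair.

    For a before b in Gamma-minus:
      a before b in Gamma-plus -> a is left of b, edge (a, b) in Ch
      a after  b in Gamma-plus -> a is below b,   edge (a, b) in Cv
    """
    pos_plus = {m: i for i, m in enumerate(gamma_plus)}
    ch, cv = set(), set()
    for i, a in enumerate(gamma_minus):
        for b in gamma_minus[i + 1:]:
            if pos_plus[a] < pos_plus[b]:
                ch.add((a, b))
            else:
                cv.add((a, b))
    return ch, cv


# Hypothetical sequence pair: b above a, c to the right of both.
ch, cv = sequence_pair_to_tcg(["b", "a", "c"], ["a", "b", "c"])
```

Because every pair of modules receives exactly one edge, the result has a defined geometric relation for each pair, in line with P*-admissibility Condition (5).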
4.8.3 From TCG-S to a placement

The basic idea is to process the modules in the order defined by Γ− and
to pack the current module to a corner formed by two previously placed
modules in RH (RV) according to the geometric relations defined in Ch and Cv. We
can keep the modules bi in RH (RV) in a balanced binary search tree TH (TV) in
increasing order of their right (top) boundaries. For easier
presentation, we add a dummy module bs (bt) to RH (RV) to denote the left (bottom)
boundary module of a placement. We have bs ⊢ bi (bt ⊥ bi) for every module bi.
Let (xs', ys') = (0, ∞) and (xt', yt') = (∞, 0), where a prime denotes the right
(top) boundary of a module. RH (RV) consists of bs (bt) initially, and so does the
corresponding TH (TV). To pack a module bj in Γ−, we traverse the modules bk in
TH (TV) from its root, going to the right child if bk ⊢ bj (bk ⊥ bj) and to the left
child if bk is not horizontally (vertically) related to bj. The process is repeated
for the newly encountered module until a leaf node is met.
Then bj is connected to the leaf node, and xj = xp' (yj = yp'), where bp is the
last module with bp ⊢ bj (bp ⊥ bj) in the path. After bj is inserted into TH (TV),
every successor bl with xl < xj (yl < yj) in TH (TV) is deleted since bl is no
longer in the contour. The ordering of nodes in TH (TV) can be obtained by
depth-first search. This process repeats for all modules in Γ−. We have W = xv' (H = yv')
if bv is the module in the resulting TH (TV) with the largest value, where W (H)
denotes the width (height) of the placement.
To pack the first module ba in Γ−, we traverse TH (TV) from the root bs (bt)
and insert ba as the right child of bs (bt) since bs ⊢ ba (bt ⊥ ba). Therefore, the
first module ba in Γ− is placed at the bottom-left corner, i.e., (xa, ya) = (0, 0), since
bs (bt) is the last module that is horizontally (vertically) related to ba and
xs' = 0, yt' = 0, where xs' is the right boundary of bs and yt' is the top boundary
of bt (Figure 4.2(a) shows the balanced binary search trees after ba is inserted
into TH (TV)).
Similarly, to pack the second module bb in Γ−, we traverse TH from the root bs
to its right child ba since bs ⊢ bb. Then bb is inserted as the left child of ba since
ba is not horizontally related to bb. Because bs is the last module with bs ⊢ bb in
the path, xb = xs' = 0.
Similarly, we traverse TV from the root bt to its right child ba since bt ⊥ bb.
Then bb is inserted as the right child of ba in TV since ba ⊥ bb. Therefore, yb =
ya' = 1.5 because ba is the last module with ba ⊥ bb in the path. The resulting
balanced binary search trees after performing tree rotations, TH' and TV', are shown in
Figure 4.2(b). As shown in Figure 4.2(c), after bc is inserted, bb in TH is deleted
since bb is the successor of bc and xb < xc. The resulting TH' and TV' are shown
in Figure 4.2(c). The process is repeated for all modules in Γ−.
Fig 4.2: The packing scheme for the TCG-S
4.9 Floorplanning Algorithm

A simulated annealing based algorithm is used for solution perturbation. Given an
initial TCG-S, the algorithm perturbs the TCG-S into a new TCG-S to find a
desired solution.
4.9.1 Reduction Edge
An edge (ni, nj) is said to be a reduction edge if there does not exist another path
from ni to nj, except the edge (ni, nj) itself; otherwise, it is a closure edge.
Reduction edges are needed for some operations. In Figure 4.1, for example, edges
(na, ng), (nd, ng), and (ne, ng) are reduction edges while (nb, ng) and (nc, ng)
are closure ones. With Γ−, we can find the set of reduction edges significantly
faster than by using TCG alone.
Given an arbitrary node ni in a transitive closure graph Ch (Cv), we can find
all the nodes nj that form reduction edges (ni, nj) using a Linear Scan algorithm
as follows.
4.9.1.1 Linear Scan algorithm:

i) First, we extract from Γ− those nodes nj in fout(ni) of Ch (Cv) and keep
their original ordering in Γ−.
ii) Let the resulting sequence be S. The first node nk in S and ni must form a
reduction edge (ni, nk).
iii) Then, we continue to traverse S until a node nl with (nk, nl) not in Ch (Cv) is
encountered. (ni, nl) must also be a reduction edge.
iv) Starting from nl, we continue the same process until no node is left in S.
As an example, consider Ch of Figure 4.1, from which we are to extract all reduction
edges emanating from nc. We first find S = <nd, ne, ng, nf> by extracting the nodes in
fout(nc) based on the sequence in Γ−. nc and the first node nd in S form a reduction
edge (nc, nd). Traversing S, we have another reduction edge (nc, ne), since edge (nd,
ne) is not in Ch. Starting from ne, we search for the next node n with (ne, n) not in
Ch. We find node nf, implying that (nc, nf) is also a reduction edge. Therefore, we
have found all reduction edges emanating from nc: (nc, nd), (nc, ne), (nc, nf).
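The Linear Scan steps can be sketched as a single pass over Γ−. The edge set below is hypothetical: it is chosen only so that the fan-out of c matches the sequence S used in the example, and it is not a transcription of Figure 4.1. The function name is likewise an assumption.

```python
def reduction_edges_from(ni, graph_edges, gamma_minus):
    """Linear scan for the reduction edges that leave node ni.

    graph_edges: edge set of one transitive closure graph (Ch or Cv).
    gamma_minus: packing sequence, used to order the fan-out of ni.
    """
    # Step i: fan-out of ni, kept in Gamma-minus order.
    s = [n for n in gamma_minus if (ni, n) in graph_edges]
    found = []
    nk = None
    for n in s:
        # Steps ii-iv: the first node, and every later node that is not
        # reachable from the most recent reduction-edge target nk.
        if nk is None or (nk, n) not in graph_edges:
            found.append((ni, n))
            nk = n
    return found


# Hypothetical horizontal graph in which fan-out(c) = <d, e, g, f>.
gamma_minus = ["a", "b", "c", "d", "e", "g", "f"]
ch = {("c", "d"), ("c", "e"), ("c", "g"), ("c", "f"),
      ("d", "g"), ("e", "g")}

edges_from_c = reduction_edges_from("c", ch, gamma_minus)
```

The scan touches each fan-out node once, so its cost is linear in the length of Γ−.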
4.9.2 Solution Perturbation
The four operations Rotation, Swap, Reverse, and Move are used to perturb Ch
and Cv. During each perturbation, we must maintain the three feasibility properties
for Ch and Cv. Unlike the Rotation operation, Swap, Reverse, and Move may
change the configurations of Ch and Cv and thus their properties. Further, we also
need to maintain Γ− so that it conforms to the topological ordering of the new Ch and Cv.
(i) Rotation
To rotate a module bi, we exchange the weights of the corresponding node ni in
Ch and Cv. Since the configurations of Ch and Cv do not change, neither does Γ−. Figure
4.3 shows the resulting TCG-S after rotating the module g.
Fig 4.3: The resulting TCG-S after rotating the module g
(ii) Swap
Swapping ni and nj does not change the topologies of Ch and Cv, except that nodes
ni and nj in both Ch and Cv are exchanged. Therefore, we only need to exchange bi
and bj in Γ−. Figure 4.4 shows the resulting TCG-S after swapping the nodes nc
and ng shown in Figure 4.3. The modules bc and bg in Γ− are exchanged.
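Rotation and Swap are simple enough to sketch on a minimal TCG-S container. The `TCGS` class, its field names, and the example data are assumptions made for illustration; they do not reproduce the data structures of the implementation in Chapter 5.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


@dataclass
class TCGS:
    ch: Set[Edge]              # horizontal transitive closure graph
    cv: Set[Edge]              # vertical transitive closure graph
    gamma_minus: List[str]     # packing sequence
    width: Dict[str, float]    # node weights in Ch
    height: Dict[str, float]   # node weights in Cv


def rotate(t: TCGS, m: str) -> None:
    """Exchange the weights of node m in Ch and Cv; edges are untouched."""
    t.width[m], t.height[m] = t.height[m], t.width[m]


def swap(t: TCGS, a: str, b: str) -> None:
    """Exchange nodes a and b in both graphs and in the packing sequence."""
    rename = {a: b, b: a}

    def r(n: str) -> str:
        return rename.get(n, n)

    t.ch = {(r(u), r(v)) for u, v in t.ch}
    t.cv = {(r(u), r(v)) for u, v in t.cv}
    t.gamma_minus = [r(n) for n in t.gamma_minus]


# Hypothetical three-module TCG-S.
t = TCGS(ch={("a", "c"), ("b", "c")}, cv={("a", "b")},
         gamma_minus=["a", "b", "c"],
         width={"a": 2.0, "b": 1.0, "c": 1.0},
         height={"a": 1.5, "b": 1.0, "c": 2.0})
rotate(t, "a")
swap(t, "a", "c")
```

Reverse and Move need the additional edge bookkeeping described next, so they are not covered by this sketch.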
Fig 4.4: The resulting TCG-S after swapping the nodes nc and ng
(iii) Reverse
Reverse changes the geometric relation between bi and bj from bi ⊢ bj (bi ⊥ bj) to
bj ⊢ bi (bj ⊥ bi). To reverse a reduction edge (ni, nj) in one transitive closure graph,
we first delete the edge (ni, nj) from the graph and then add the edge (nj, ni) to the
graph. To keep Ch and Cv feasible, for each node nk in fin(nj) ∪ {nj} and nl
in fout(ni) ∪ {ni} in the new graph, we have to keep the edge (nk, nl) in the graph.
If the edge does not exist in the graph, we add it to the graph and delete the
corresponding edge (nk, nl) (or (nl, nk)) in the other graph. To make Γ− conform to
the topological ordering of the new Ch and Cv, we delete bi from Γ− and insert bi
after bj. For each module bk between bi and bj in Γ−, we check whether the
edge (ni, nk) exists in the same graph. If it does, we delete bk from Γ− and insert
it after the most recently inserted module.
Figure 4.5 shows the resulting TCG-S after reversing the reduction edge
(nd, ne) of the Cv in Figure 4.4. Since there exists no module between bd and be in
Γ−, we only need to delete bd from Γ− and insert it after be; the resulting Γ− is
shown in Figure 4.5.
Figure 4.5: The resulting TCG-S after reversing the reduction edge (nd, ne)
(iv) Move
Move changes the geometric relation between bi and bj from bi ⊢ bj (bi ⊥ bj) to
bi ⊥ bj (bi ⊢ bj). To move a reduction edge (ni, nj) from a transitive closure
graph G to the other graph G', we delete the edge from G and then add it to G'.
Similar to Reverse, for each node nk in fin(ni) ∪ {ni} and nl in fout(nj) ∪ {nj}
in G', we must move the edge (nk, nl) to G' if the corresponding edge (nk, nl)
(or (nl, nk)) is in G. Since the operation changes only the edges in Ch or Cv but
not the topological ordering among nodes, Γ− remains unchanged. Figure 4.6 shows
the resulting TCG-S after moving the reduction edge (na, ne) from Ch to Cv in
Figure 4.5. Notice that the resulting Γ− is the same as that in Figure 4.5.
Figure 4.6: The resulting TCG-S after moving the reduction edge (na, ne) from
Ch to Cv
4.10 Placement with Constraints

4.10.1 TCG-S with boundary constraints
Placement with boundary constraints requires a set of prespecified modules to be
placed along the designated boundaries of a chip.
Boundary Constraint: Given a boundary module bi, it must be placed on one of the
four sides of the chip (left, right, bottom, or top) in the final
packing. If a module bi is placed along the left (right) boundary, the
in-degree (out-degree) of the node ni in Ch equals zero. If a module is placed along
the bottom (top) boundary, the in-degree (out-degree) of ni in Cv equals zero. For
each perturbation, we can guarantee a feasible placement by checking whether the
conditions of boundary modules are satisfied.
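The degree conditions above can be checked with a few set scans. The sketch below tests only the stated in-degree/out-degree condition for a given side; the function name and the example graphs are assumptions, and a module passing the check still has to be verified against the final packing.

```python
def meets_boundary_condition(module, side, ch, cv):
    """Check the degree condition for a module assigned to a chip side.

    left/right  -> in-degree/out-degree of the node in Ch must be zero
    bottom/top  -> in-degree/out-degree of the node in Cv must be zero
    """
    if side not in ("left", "right", "bottom", "top"):
        raise ValueError(f"unknown side: {side}")
    graph = ch if side in ("left", "right") else cv
    if side in ("left", "bottom"):
        return all(head != module for _, head in graph)  # in-degree 0
    return all(tail != module for tail, _ in graph)      # out-degree 0


# Hypothetical graphs: b above a, c to the right of both.
ch = {("a", "c"), ("b", "c")}
cv = {("a", "b")}

a_on_left = meets_boundary_condition("a", "left", ch, cv)
c_on_left = meets_boundary_condition("c", "left", ch, cv)
```

Running such a check after each perturbation lets the annealer reject or repair moves that push a boundary module away from its side.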
4.11 Advantages Of TCG-S

The orthogonal combination of TCG and SP, TCG-S = (Ch, Cv, Γ−), leads to a
representation with at least the following advantages:
i) With the property of SP, faster packing and perturbation schemes are possible
for a P*-admissible representation.
ii) Inherited from TCG, the geometric relations among modules are
transparent to TCG-S, implying faster convergence to a desired solution.
iii) Inherited from TCG, placement with position constraints becomes much
easier.
iv) TCG-S can support incremental update for cost evaluation.
Chapter 5
Software Implementation
5.1 Software Implementation Phases
In order to study and characterize the floorplan representation TCG-S when
applied to floorplan optimization, a C++ program was implemented on Linux.
The program uses the floorplan benchmark circuit file formats as
its input. The software implementation of the SA placement algorithm went
through several phases as given below.
PHASE 1: Placement to TCG-S
i) TCG-S: TCG + SP
ii) TCG-S: HCG & VCG from initial floorplan
PHASE 2: Solution perturbation using the simulated annealing algorithm
i) Rotation
ii) Swap
iii) Reverse
iv) Move
PHASE 3: TCG-S to Placement
i) HCG & VCG to floorplan
5.2 Experimental Code

This work began with the definition of the data structures to be used. As a design
consists of a set of soft and hard rectangular blocks with specified area and aspect
ratio constraints, structures representing these elements are defined and
populated with the data required to manage the floorplan algorithm. An overview of
the initial organization is given below.
i) Structure which defines rectangular blocks in a floorplan benchmark
circuit. Components of the structure are block name, type (hard or soft), area, xcord,
ycord, leng, width, and minimum and maximum aspect ratios.
ii) Structure which defines nodes in a graph (Gerez structure for directed acyclic
graphs). Components of the structure are node name, ch weight, cv weight, longest
path, xcord, ycord, struct edge *outgoing, indegree, and outdegree.
iii) Structure which defines edges in a graph. Components of the structure are
edge index, from, to, and next.
An array of these data structures is maintained to hold the blocks of
a floorplan benchmark circuit. The first step is reading the benchmark circuit file and
storing the information in the data structures defined.
5.3 Benchmarks
In order to provide a test set for algorithm characterization, the GSRC/MCNC
benchmarks are employed. The benchmark sizes range from 33 to 200k
components across 18 circuits. In the simplest formulation of the block packing
problem, all the blocks are rectangular hard blocks with fixed heights and widths,
and their locations are free to assign within the whole layout region. However, in
a real design, the block packing algorithm may have to deal with blocks with more
complex features:
Soft blocks
In the early stage of physical design many of the circuit blocks are not yet
designed and are thus flexible (soft) in shape. For these soft blocks, the
block packing algorithm not only needs to determine their locations, but
also needs to assign specific shapes to them.
Rectilinear blocks
As some of the circuit blocks come from design re-use, their shapes are not
necessarily rectangular. Therefore, the block packing algorithm should be able
to handle arbitrarily shaped rectilinear blocks.
Pre-placed blocks
In some cases, the locations of some blocks may be fixed, or a region may
be specified for their placement. For example, in high performance chips,
the clock buffer may have to be located in the center of the chip to reduce
the time difference between the clock signal arrival times at different
blocks. The block packing algorithm should put these pre-placed blocks in
their specified location or region without overlapping other blocks.
The benchmark circuits Ami33, Ami49, Apte, Xerox, Hp, N200, and N300 were used
to test the software implementation.
5.3.1 The blocks format
The blocks file specifies the name and other optional information about each
block/terminal node in the floorplan. Each line specifies a single block/terminal.
After the standard header, the format specifies:
NumSoftRectangularBlocks: number of soft rectangular block nodes
NumHardRectilinearBlocks: number of hard rectilinear block nodes
NumTerminals: number of terminal (pad etc.) nodes
Then, one line for each soft rectangular block node with the format as follows:
NodeName softrectangular area minAspectRatio maxAspectRatio
5.4 Logical Modules
In order to analyze the SA algorithm and characterize its behavior, it is necessary to
logically modularize the implemented software. Each module is represented by a
number of function calls in the program.
5.4.1 Design of width and length for given area of block

Width and length are designed for each soft block of given area, satisfying the
minimum and maximum aspect ratio constraints.
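One way to size a soft block is to pick an aspect ratio inside the allowed range and derive both dimensions from the area. The sketch below assumes the aspect ratio is defined as height divided by width and aims for a square shape when the range allows it; both choices, and the function name, are assumptions rather than details of the implemented program.

```python
import math


def soft_block_dimensions(area, min_aspect, max_aspect, target_aspect=1.0):
    """Return (width, height) with width*height = area and an aspect ratio
    (assumed here to be height / width) clamped into the allowed range."""
    if area <= 0 or min_aspect <= 0 or min_aspect > max_aspect:
        raise ValueError("invalid area or aspect ratio bounds")
    aspect = min(max(target_aspect, min_aspect), max_aspect)
    width = math.sqrt(area / aspect)
    height = area / width
    return width, height


square = soft_block_dimensions(16.0, 0.5, 2.0)  # (4.0, 4.0)
tall = soft_block_dimensions(8.0, 2.0, 3.0)     # (2.0, 4.0)
```

During annealing, the target aspect ratio could itself become a perturbation parameter for soft blocks.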
5.4.2 Building an initial floorplan from the blocks of benchmark circuit.

To construct an initial floorplan from the blocks of a benchmark circuit,
the blocks are first sorted in ascending order of length and arranged row-wise. The x
and y coordinates of each block are saved in the array of floorplan data
structures.
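A row-wise initial arrangement of this kind might look as follows. The row width limit, the choice of the horizontal dimension as the sort key, and all names are assumptions made for this sketch; the report does not state how the rows are bounded.

```python
def initial_rows(blocks, max_row_width):
    """Place blocks left to right in rows, starting a new row when full.

    blocks: name -> (width, height); returns name -> (x, y) of the
    bottom-left corner.
    """
    # Sort by the horizontal dimension (assumed meaning of "length").
    order = sorted(blocks, key=lambda name: blocks[name][0])
    coords = {}
    x = y = row_height = 0.0
    for name in order:
        w, h = blocks[name]
        if x > 0 and x + w > max_row_width:
            # Start a new row above the tallest block of the current row.
            x, y, row_height = 0.0, y + row_height, 0.0
        coords[name] = (x, y)
        x += w
        row_height = max(row_height, h)
    return coords


# Hypothetical blocks and row width.
blocks = {"a": (2.0, 1.5), "b": (1.0, 1.0), "c": (1.0, 2.0)}
coords = initial_rows(blocks, max_row_width=3.0)
```

The resulting placement is overlap-free by construction, so it can be converted directly into an initial TCG-S.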
5.4.3 Extracting Γ− (sequence pair representation) from the floorplan.
Γ− of the sequence pair representation can be obtained by using the method explained
in section 3.8.1. First, a tree is constructed from the placement. A depth-first
traversal gives one sequence of the pair and a breadth-first traversal gives the other.
5.4.4 Construction of Horizontal constraint and Vertical constraint graph
HCG and VCG are constructed from the initial floorplan according to Γ− (SP), as
explained in section 3.8.
5.4.5 Finding reduction edges.
Set of reduction edges is found for performing solution perturbations.
5.4.6 Floorplanning algorithm using simulated annealing.
Initially we have to decide:
i) The state space
ii) The neighborhood structure
iii) The cost function
iv) The initial state
v) The initial temperature
vi) The cooling schedule (how to change t)
vii) The freezing point
Then solution perturbations are performed on reduction edges by the four operations
swap, move, rotation, and reverse.
5.4.6.1 Design Perturbation

The perturbation function is a combination of four different operations:
move, rotate, reverse, and swap.
5.4.6.2 Move Acceptance
This module is simple in concept but represents the essence of SA: the
unconditional acceptance of cost-decreasing moves and the possibility of
accepting cost-increasing moves based on the current system temperature. The
change in the design's cost is produced by the previous two modules and
determines the factor by which the move is analyzed for acceptance. Any
negative change in cost (i.e., a good move) is unconditionally accepted, while a
positive change in cost is not unconditionally rejected: a random value is
generated and, if it is less than the calculated factor, the move is accepted
despite its cost increase. During the high-temperature phase of placement,
moves that increase the design's cost are readily accepted. As the temperature is
reduced, the size of the cost increase that has a given chance of being
accepted is also reduced. Large increases therefore become unlikely to be
accepted while the temperature is still relatively high, whereas small increases
become unlikely only at lower temperatures. If the move is not accepted under
either of the above two circumstances, it is rejected and any changes that have
been made are undone.
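The acceptance rule can be sketched as follows. The sketch assumes the standard Metropolis factor exp(-deltaCost / temperature) as the "calculated factor", which this section does not spell out; the function name `acceptMove` and the use of `std::mt19937` as the random source are likewise choices made for the example.

```cpp
#include <cmath>
#include <random>

// Decides whether a perturbation with cost change deltaCost is kept.
// Assumption: the acceptance factor is the Metropolis term
// exp(-deltaCost / temperature).
bool acceptMove(double deltaCost, double temperature, std::mt19937& rng) {
    if (deltaCost <= 0.0) {
        return true;  // cost did not increase: always accept
    }
    if (temperature <= 0.0) {
        return false;  // frozen system: never accept an uphill move
    }
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    // Uphill move: accept when the random value falls below the factor.
    return uniform(rng) < std::exp(-deltaCost / temperature);
}
```

With this factor, a cost increase of 10 at temperature 100 is accepted with probability of about 0.90, while the same increase at temperature 1 is accepted with probability of about 0.00005, which matches the behavior described above.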
5.4.6.3 Design Update
After the decision has been made to either reject or accept the move, the design
must be updated accordingly, since some change to the design was made
during the move attempt. If the move is accepted, all that is
required is to check the overlap list and remove components
that no longer exhibit overlap. If the move is rejected, the design needs to
be reset as if the move had never been attempted. This involves not only moving the
component(s) back to their original position(s), but also iterating through the list
of modified nets and resetting their cost, iterating through the list of overlapping
components and resetting any changes made to their overlap status, and finally
resetting the design's overlap cost, which is recorded across iterations. In both
cases the list of nets to be updated is deleted, as this point represents the end of the
move attempt.
5.4.6.4 Temperature Schedule

The temperature schedule greatly influences the behavior and quality of the
placement, so it deserves a place distinct from the rest of the algorithm. The
temperature is reduced by two factors determined by the user of the program.
One factor is used in the outer regions of the schedule, while the other is used in
the middle region. Typically, the reduction factors and the borders
that determine their use are set so that the high-temperature zone is traversed
quickly, down to the point where moves begin to reduce the overall cost of the design.
From this point the temperature is reduced slowly until moves are
mostly rejected, after which the temperature is again reduced quickly, as in the
high-temperature region.
5.4.6.5 Construction of horizontal and vertical contours RH (RV) and packing
the modules according to the packing sequence in TH (TV)
RH (RV) is a list of modules bi for which there exists no module bj with
yj ≥ yi' (xj ≥ xi') and xj' ≥ xi' (yj' ≥ yi'). The modules bi in RH (RV) can be kept in a
balanced binary search tree TH (TV) in increasing order of their right
(top) boundaries.
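The horizontal contour can be illustrated with the sketch below. It rests on several assumptions: (x, y) is taken as the bottom-left corner and (xr, yt) as the top-right corner of a module, the comparison operators lost in the condition are read as "greater than or equal", and `std::multimap` (typically implemented as a red-black tree) stands in for the balanced binary search tree TH. The quadratic scan is for clarity only and is not the incremental update a real packer would use.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical module record: bottom-left (x, y) and top-right (xr, yt).
struct Module {
    double x, y, xr, yt;
};

// Builds the horizontal contour: module i is kept unless some other module j
// lies at or above its top edge (y_j >= yt_i) and reaches at least as far
// to the right (xr_j >= xr_i). Entries are ordered by right boundary.
std::multimap<double, std::size_t> horizontalContour(
    const std::vector<Module>& modules) {
    std::multimap<double, std::size_t> contour;
    for (std::size_t i = 0; i < modules.size(); ++i) {
        bool covered = false;
        for (std::size_t j = 0; j < modules.size() && !covered; ++j) {
            covered = (j != i) && modules[j].y >= modules[i].yt &&
                      modules[j].xr >= modules[i].xr;
        }
        if (!covered) {
            contour.insert({modules[i].xr, i});
        }
    }
    return contour;
}
```

The vertical contour follows by exchanging the roles of x and y and ordering by top boundary.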
CHAPTER 6
RESULTS
6.1 Organization
This chapter presents the experimental results of the software
implementation of the floorplan representation TCG-S and compares its performance
with other representations. The TCG-S representation has been implemented,
based on a simulated annealing method, in the C++ programming language on a Linux
platform and evaluated on commonly used MCNC and GSRC benchmark
circuits. The results summarize the area utilization of the two floorplan
representations, TCG and TCG-S.
6.2 TCG
Circuit Name    No. of Blocks    Area Utilization (%)
Apte            9                79
Hp              11               80
Xerox           10               77
Ami33           33               85
Ami49           49               86
N10             10               82
N30             30               83
N50             50               86
N200            200              87
N300            300              90
Table 6.1 Summary of area utilization for various benchmark circuits (TCG)
6.3 Sample Output

Circuit Name: n200 (GSRC Benchmark)
Number of Modules: 200

6.4 TCG-S
Circuit Name    No. of Blocks    Area Utilization (%)
Apte            9                79
Hp              11               80
Xerox           10               78
Ami33           33               85
Ami49           49               87
N10             10               84
N30             30               83
N50             50               88
N200            200              88
N300            300              91
Table 6.2 Summary of area utilization for various benchmark circuits (TCG-S)
6.5 Sample output

Circuit Name: Apte (MCNC Benchmark)

Circuit Name: Xerox (MCNC Benchmark)
Number of modules: 10

Circuit Name: hp (MCNC Benchmark)

Circuit Name: n10 (GSRC Benchmark)

Circuit Name: ami33 (MCNC Benchmark)

Circuit Name: ami49 (MCNC Benchmark)
CHAPTER 7
CONCLUSIONS
This work began by implementing the floorplan representations TCG (transitive
closure graph) and SP (sequence pair). By combining these two representations,
a new representation, TCG-S, was developed. A simulated annealing algorithm was
designed for solution perturbation, and an optimized solution with respect to
component placement and minimization of floorplan area was obtained. This step
gave insight into the intricacies of the simulated annealing algorithm, leading to
a novel approach to modeling a floorplan. With area optimization as the primary objective,
experiments on a set of commonly used MCNC benchmarks show that
TCG-S results in the best area utilization and convergence speed.
CHAPTER 8
FUTURE DEVELOPMENTS
The presented annealing-based optimization considers only area; however, this
work can be extended to other objectives, such as wirelength minimization and placement
with pre-placed modules.
8.1 Wirelength Minimization
Global interconnect is commonly recognized as a key factor in designing
high-performance integrated circuits as VLSI process technology migrates into deep
submicron (DSM) dimensions and operates at gigahertz clock frequencies. By
using a wide range of interconnect synthesis and optimization techniques, such as
topology optimization, buffer insertion, layer assignment, wire sizing, and wire
spacing, the performance of a global interconnect could be improved by a factor
of 5 or more. As the global interconnects are largely determined by floorplanning,
it becomes critical for floorplanning engines to handle interconnect planning
and optimization efficiently, so that overall timing and design
convergence can be achieved.
8.2 TCG-S with Pre-placed Modules
Placement with pre-placed modules places a set of prespecified modules
at designated locations on a chip.
Pre-placed Constraint: Given a module bi with a fixed coordinate (xi, yi)
and an orientation, bi must be placed at the designated location with the same
orientation in the final packing.
BIBLIOGRAPHY
[1] J.-M. Lin and Y.-W. Chang, "TCG-S: Orthogonal Coupling of P*-admissible
Representations for General Floorplans," IEEE Trans. Computer-Aided Design,
Vol. 24, No. 6, pp. 968-980, June 2004.
[2] Jai-Ming Lin and Yao-Wen Chang, "TCG: A Transitive Closure Graph-Based
Representation for General Floorplans," IEEE Transactions on Very Large Scale
Integration Systems, 2003.
[5] P.-N. Guo, C.-K. Cheng, and T. Yoshimura, "An O-Tree representation of
non-slicing floorplan and its applications," Proc. DAC, pp. 268-273, 1999.
[6] Sadiq M. Sait and Habib Youssef, VLSI Physical Design Automation: Theory
and Practice, IEEE Press.
[7] Naveed Sherwani, Algorithms for VLSI Physical Design Automation,
Kluwer Academic Publishers, 1995.
[8] Floorplanning in VLSI, available online: www.manchester.ac.uk
[9] Design optimization, available online: www.me.mtu.edu
[10] VLSI Design Automation, available online: www.mountains.ece.umn.edu
[11] Floorplanning by Dinesh Bhatia, available online: www.dallas.edu
[12] Floorplanning by Professor Lei He, available online: www.ee.ucla.edu
[13] Recent Development in VLSI Floorplan Representations, available online:
www.ee.nthu.edu.
[14] CAD for VLSI Simulated Annealing Algorithm, available online:
www.ece.wisc.edu
[15] Physical Design. Floorplan, available online: www.sharif.edu
[16] VLSI Placement, available online: www.ee.ucla.edu
[17] VLSI CAD, available online: www.vlsicad.ucsd.edu
[18] Modeling of a Hardware VLSI Placement System: Accelerating the
Simulated Annealing Algorithm (thesis), available online: www.ritdml.rit.edu
