
SYNOPSYS, INC.

700 East Middlefield Road, Mountain View, CA 94043 USA


Phone: 650-584-4200, OR: 1-800-245-8005

PrimeTime® Clock Reconvergence Pessimism Removal (CRPR) Application Note

Version 1.4

History
5th March 2003: Initial version
16th September 2003: Revised with the addition of the following sections
    Why is PT’s CRPR calculation best?
    CRPR & Multiplexed Clocks
25th Nov: CPU & Memory Performance
2nd Dec: Revised to add / modify
    Advantages of PT’s CRP calculation over other methods
    CRPR examples of finding critical paths
    Minimum threshold settings

PrimeTime Document
Proprietary Information-Not for distribution without Synopsys Approval 1

1 Glossary of Acronyms
The following abbreviations are used in this article.
CRP: Clock Reconvergence Pessimism
CRPR: Clock Reconvergence Pessimism Removal
SI: Signal Integrity


2 Introduction
The main topics of this application note are listed below. Each is explained in more detail in the sections
that follow.

What is CRPR?
An explanation of what Clock Reconvergence Pessimism is and how it is calculated.
Advantages of PT’s CRP calculation over other methods
PrimeTime’s original implementation of CRPR was of the “path based” type, a technique with inherent
limitations. This section compares path based and graph based CRP calculations and explains how a path
based calculation can miss critical timing paths. All versions of PrimeTime from 2001.08 onwards use a
graph based CRP solution.
CRPR command and variable overview
An overview of all the applicable CRPR commands and variables and their usage.
report_timing & report_crpr may show slightly different CRP values
The value of CRP reported by report_timing may be less than the value of CRP reported by report_crpr,
with a lower bound set by the CRPR variable timing_crpr_threshold_ps (see variables and commands below).
CRPR & Latches
CRPR & SI analysis
An explanation of the CRPR calculation with SI analysis.
CRPR & IR drop
An explanation of how CRPR works with the multi-voltage flow in PrimeTime T-2002.09, including how CRP
removal automatically and accurately accounts for delay and slew differences due to annotated cell rail
voltage values as part of the IR drop flow.
CRPR & Multiplexed Clocks
The support of related and combined clocking structures in the V-2003.12 release.
CPU & Memory Performance
Performance and memory improvements in the V-2003.12 release.
Understanding calculation pessimism introduced by the CRPR threshold
Known Issues
References


3 What is CRPR?
CRPR is the removal of artificially induced pessimism from the timing reported between a launching and a
capturing device. If the same clock drives both devices, the launching and capturing clock paths share a
common sub-path before branching. We refer to this sub-path as the common portion. CRP itself is the
difference between the maximum and minimum arrival times of the clock signal at the common point. The
common point is defined as the output pin of the last cell in the common portion of the clock tree (see
Figure 1 below).

CRPR is mainly applicable in on-chip variation mode, where the worst possible timing variation may occur
throughout the chip. It may also be present in single operating condition, or best case / worst case analysis, as
it is an STA effect that is circuit topology dependent. However, timing variation seen in the clock network
will not be as severe in these modes, and hence the resulting CRP value will not be as significant to the paths
of interest in the analysis.

The following table illustrates the variety of timing data in “on-chip variation mode”.

                     SETUP (MAX) CHECK               HOLD (MIN) CHECK
Operating
Condition            Launch    Capture   Data        Launch    Capture   Data
BEST CASE            Late      Early*    Late        Early*    Late      Early*
WORST CASE           Late*     Early     Late*       Early     Late*     Early

Table 1: Variety of timing data in on-chip variation mode

The entries marked with an asterisk (*) signify the selection made by on-chip variation mode for static
calculation purposes.

The terms “late” and “early” in the table have the following meanings:

Late (max): The latest possible time for data to either leave a pin or arrive at a pin. This is also
referred to as the “max” path, where “max” in this context refers to delay alone and is not to be
confused with the MAX (setup check) type.

Early (min): The earliest possible time for data to either leave a pin or arrive at a pin. This is also
referred to as the “min” path, where “min” in this context refers to delay alone and is not to be
confused with the MIN (hold check) type.

As can be observed from the table:

A setup check consists of the latest possible data launch time, combined with the earliest possible capture
time and the longest possible (max)delay on the data path.

A hold check consists of the earliest possible data launch time, the latest possible capture time and the
shortest possible (min) data path.


This is illustrated in the diagram below for a setup check.


[Figure 1: CRPR definition for setup check. The figure shows the launching device FF1 and the capturing
device FF2 clocked from CLK through the clock cells U1, U2 and U3, with the late (max) data path from FF1
to FF2, the latest launch path, the earliest capture path, and the common point (output pin of the last
common cell) marked.]


As is observed in the diagram above, where we are considering a setup check, we have a common portion in
the clock network.

During STA the setup timing report calculation is constructed from the launching clock path, data path and
the capturing clock path. The launching clock path and data path both consider LATE signal propagation
times, whilst the capturing clock path considers the EARLY signal propagation time.

In a physical design, however, the cells along the common portion of the clock tree cannot simultaneously
achieve their maximum and minimum delay values. Thus there will be a single value of delay to the common
point that will be propagated to both the launching and capturing devices. This conflicts with STA since we
utilise two sets of delay values at the common point.

Therefore our timing report contains artificially introduced pessimism that is derived from our usage of
EARLY and LATE arrival times for the launching and capturing paths along this common portion of the
clock network. The value of this pessimism is the difference between the EARLY and LATE arrival times at
the common point in the clock network.

Hence it is valid to remove, from the final slack calculation, the pessimism artificially introduced during the
slack calculation.

Clock Reconvergence Pessimism (CRP)

CRP = (latest arrival time @ common point) – (earliest arrival time @ common point)


The situation is identical for hold checks, since we are using the EARLY values for the launch path and the
LATE values for the capture path. Therefore the CRP value calculated at the common point will be identical
to that calculated for a setup check.
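
The effect of CRP on slack can be illustrated with a short worked calculation. The sketch below is illustrative only: the function names and every arrival time (in picoseconds) are assumptions made for this example, and the slack equations are the usual textbook setup and hold forms, not PrimeTime's internal implementation.

```python
# Worked example of CRP and its effect on setup and hold slack.
# All names and values are hypothetical; times are in picoseconds.

def crp(latest_at_common_ps, earliest_at_common_ps):
    """CRP = latest arrival - earliest arrival at the common point."""
    return latest_at_common_ps - earliest_at_common_ps


def setup_slack(period_ps, launch_clk_late_ps, data_late_ps,
                capture_clk_early_ps, setup_ps, crp_ps=0):
    """Setup check: late launch clock and data against an early capture clock."""
    arrival = launch_clk_late_ps + data_late_ps
    required = period_ps + capture_clk_early_ps - setup_ps
    return required - arrival + crp_ps


def hold_slack(launch_clk_early_ps, data_early_ps,
               capture_clk_late_ps, hold_ps, crp_ps=0):
    """Hold check: early launch clock and data against a late capture clock."""
    arrival = launch_clk_early_ps + data_early_ps
    required = capture_clk_late_ps + hold_ps
    return arrival - required + crp_ps


# Clock arrival at the common point: 600 ps late, 500 ps early.
crp_ps = crp(600, 500)

# Clock arrivals at the register clock pins (common portion plus branch).
LAUNCH_LATE, LAUNCH_EARLY = 1000, 880
CAPTURE_LATE, CAPTURE_EARLY = 980, 850

setup_without = setup_slack(4000, LAUNCH_LATE, 3000, CAPTURE_EARLY, 100)
setup_with = setup_slack(4000, LAUNCH_LATE, 3000, CAPTURE_EARLY, 100, crp_ps)
hold_without = hold_slack(LAUNCH_EARLY, 120, CAPTURE_LATE, 50)
hold_with = hold_slack(LAUNCH_EARLY, 120, CAPTURE_LATE, 50, crp_ps)

print("CRP:", crp_ps)
print("setup slack without / with CRPR:", setup_without, setup_with)
print("hold slack without / with CRPR:", hold_without, hold_with)
```

With these assumed numbers the same CRP value is credited to both checks, and the hold check moves from a small violation to a positive slack once the pessimism is removed.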


4 Advantages of PT’s CRP calculation over other methods
4.1 Methods of calculating CRP
In computing a CRP value that is to be utilized in a slack calculation 2 possible methods exist.

1. Path based
a. Advantage : For small critical path sets, relatively quick
b. Disadvantage : By default, non exhaustive CRP analysis
- can cause critical paths to be missed
c. Disadvantage : Exhaustive analysis is infeasible
d. Disadvantage : Has inherent limitations in support of certain structures
e. Disadvantage : Cannot be used for sign-off
2. Graph based
a. Advantage : Exhaustive CRP analysis
- will not miss critical paths
b. Advantage : Can support arbitrarily complex structures
c. Disadvantage : Requires more memory and CPU than path based

4.2 History
Prior to release 2001.08 of PrimeTime, the computation of CRP was performed using path based methods.
Starting with the release 2001.08 the calculation method was modified to be graph based.

4.3 Why did PrimeTime change its approach?

PrimeTime changed its approach from method 1 (path based) to method 2 (graph based) because:

• With path based methods, critical paths can be missed.
• Support of diverse and complex clocking structures is very difficult with path based methods.
• Extending path based methods to give exhaustive CRP analysis is unworkable. Runtime and memory usage
  would be much higher than in the graph based approach, since exhaustive analysis is equivalent to:
  report_timing -nworst ∞ -max_paths ∞

4.4 What is the difference between the 2 methods ?

Path based methods evaluate CRP in the following way


1. Find the critical path set with CRP off
2. Post-process this set ONLY to calculate the CRP by considering the common portion of the clock
network.

Therefore the number of paths to consider in the calculation is the size of the critical path set found in
step 1.

In path based approaches, the runtime and memory usage are a function of the critical path set.


CPU / MEM usage = f(critical_path_set)

If the critical path set is small, then so will be the runtime and memory usage.

Graph based methods evaluate CRP in the following way


When update_timing is called
1. Exhaustively process the entire design calculating slack considering the CRP for every path
endpoint in the design.
2. Generate the critical path set.

In graph based approaches, at the most general level, the runtime and memory usage is a function of the
design size, however the analysis mode and the threshold setting also significantly affect performance.

CPU / MEM usage = f(Design_Size, CRP_Threshold, Analysis_Mode)

If the design has a large number of instances, runtime and memory usage will be correspondingly large.

In addition, if the clock network is still in the pre-layout stage, runtime will be very large. Note that
in this case CRP analysis is in general not valid either.

Typically, runtime and memory usage with CRPR ON are expected to be approximately 2-3X those of standard
PrimeTime, depending upon processing conditions and design style.

4.5 What effect does the CRPR threshold have ?

The CRPR threshold value is a means of controlling runtime and memory usage without sacrificing too much
accuracy.

The threshold value itself causes groups of latches to be formed that have the same CRP value within a
certain range (the threshold value). This allows many efficiencies to be made in data processing.

Setting a very low CRP threshold guarantees a long runtime and high memory usage, as no processing
efficiencies can be made. Every instance in the design must be processed individually.

In general, very good accuracy can be achieved by setting the value to 5-10 ps. Setting the value lower
than 1 ps does not help the accuracy of the calculation and causes excessive runtime and memory usage.

In future releases of PrimeTime, the threshold minimum setting will be limited.


4.6 How would a path based method miss a critical path ?


The effect of CRP on the slack calculation is the same for setup and hold checks in both methods. The
overall effect is to reduce the number of violating paths for both setup and hold(i.e. increase slack values).

Consider the case below where the signal arrival times of path 1 and path 2 are very close, but path 1 is worse
than path2. In the example we are only considering the setup check, but the same reasoning applies for hold
checking also.

The correct result for the above example is

If CRP is OFF then path1 is the path giving the worst endpoint slack at the capturing Flip-Flop

If CRPR is ON and if we consider both paths together then path 2 is the path giving the worst endpoint slack
at the capturing Flip-Flop

A sample calculation is as follows.


With CRPR OFF if we execute a command of the form

pt_shell> report_timing -nworst 1 -max_paths 10000


(in other words, report only the single worst path to each endpoint, up to a total of 10,000 paths)

Path1 is reported correctly as the worst path

However, when CRPR is ON the results from the 2 methods of CRP removal will differ

Path Based
Path 1 will still be reported (incorrectly) as the worst path. This is because the method relies on gathering
the critical path set with CRP off and then post-processing that set to recalculate the path slacks
considering CRP.

If path 2 is not returned in the critical path set, as it would not be in the above case, then the path based
method will not find the true critical path.

The only way for a path based method to catch this type of condition is to have every path in the entire
design reported and processed as a critical path. This is obviously not a feasible approach.

i.e. report_timing -nworst ∞ -max_paths ∞


Graph based
Since the entire design is considered in the calculation of CRP, Path 2 will be correctly reported as the worst
path.
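
The difference between the two methods can be reproduced with a toy model. The sketch below is a simplified illustration under stated assumptions: the two paths, their slack and CRP values (in picoseconds), and the helper names `path_based_worst` and `graph_based_worst` are all hypothetical. In particular, the "graph based" function simply applies CRP to every path before ranking, which mimics the exhaustive result but not the way PrimeTime actually computes it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    slack_no_crp_ps: int  # setup slack with CRPR off
    crp_ps: int           # pessimism on the shared clock portion

    @property
    def slack_with_crp_ps(self):
        # Removing CRP adds the pessimism back to the slack.
        return self.slack_no_crp_ps + self.crp_ps


def path_based_worst(paths, nworst=1):
    """Pick the critical set with CRP off, then post-process only that set."""
    critical_set = sorted(paths, key=lambda p: p.slack_no_crp_ps)[:nworst]
    return min(critical_set, key=lambda p: p.slack_with_crp_ps)


def graph_based_worst(paths):
    """Consider CRP for every path to the endpoint before ranking."""
    return min(paths, key=lambda p: p.slack_with_crp_ps)


# Hypothetical paths to the same capturing flip-flop. Path 1 is slightly worse
# with CRP off, but it shares much more of the clock tree with the capture clock.
PATHS = [
    Path("path1", slack_no_crp_ps=-300, crp_ps=200),
    Path("path2", slack_no_crp_ps=-250, crp_ps=20),
]

print("path based :", path_based_worst(PATHS).name)
print("graph based:", graph_based_worst(PATHS).name)
```

In this model the path based result only becomes correct when the critical set is enlarged until it happens to contain path 2 (here, `nworst=2`), which is the small-scale version of the exhaustive report_timing run described in the text.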


4.7 When should I run with CRPR on ?

PrimeTime’s graph based CRPR approach is highly optimized for post–layout sign-off evaluation. If a pre-
layout clock tree is present, then the algorithm’s compute mode will be highly sub-optimal, leading to very
large runtime and memory usage.

Path based CRPR approaches do not suffer from this problem, as they are not dealing with very large data
sets as in graph based methods.

Note also that the CRP value computed in a pre-layout scenario will be meaningless in the context of the
slack calculation. This is because the clock signal arrival times at the latch clock pins will not realistically
represent the arrival times of the post-layout clock tree.

We therefore recommend that CRPR be turned OFF for pre-layout analysis runs.


5 CRPR command & variable overview


Variable values are given in a list, with the default value marked by the suffix “:default”. Variables are
prefixed with the text ‘var’, commands are prefixed with the text ‘cmd’. Each command and variable is
explained in more detail below.

CRPR has 4 variables & 2 commands. They are:

• var: timing_remove_clock_reconvergence_pessimism {true, false:default}
• var: timing_clock_reconvergence_pessimism {normal:default, same_transition}
• var: timing_crpr_threshold_ps {max = hardware dependent; 20ps:default; min = 2e-5ps}
• var: crpr_consider_level_sensitive_edge {both:default, open, close}
  (2002.09 releases only)
• cmd: report_crpr
• cmd: report_timing

5.1 var: timing_remove_clock_reconvergence_pessimism {true, false:default}
Action : Turns CRPR on and off

5.2 var: timing_clock_reconvergence_pessimism {normal:default, same_transition}
Action: As per the table below. The values in the table body are the values that the CRP calculation uses.

                          timing_clock_reconvergence_pessimism
Transition type
@ common point            normal                          same_transition
Rise                      crp_rise                        crp_rise
Fall                      crp_fall                        crp_fall
Mismatch                  min of crp_rise & crp_fall      zero

Where “mismatch” means that there is more than 1 transition type (i.e. rising & falling) required at the
common point to drive the launching and capturing registers.

If both the launching and capturing devices are triggered on the same edge of the clock signal, a mismatch
means that one of the paths between the common point and either the launching or capturing register
experiences an inversion during propagation.

Alternatively a mismatch can occur where neither of the paths from the common point to the launching and
capturing devices has experienced an inversion, but the devices themselves are activated by different edges of
the clock signal.
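
The selection rules for timing_clock_reconvergence_pessimism can be written as a small lookup. This is a sketch of the documented behavior only: the function name `crp_used`, the string labels for the transition type, and the numeric values are assumptions made for illustration, while `crp_rise` and `crp_fall` stand for the CRP values computed for rising and falling transitions at the common point.

```python
def crp_used(transition, mode, crp_rise, crp_fall):
    """Return the CRP value used for a given transition type at the common point.

    transition: "rise", "fall" or "mismatch" (both edge types are required
                at the common point to drive the two registers)
    mode:       value of timing_clock_reconvergence_pessimism,
                "normal" or "same_transition"
    """
    if mode not in ("normal", "same_transition"):
        raise ValueError(f"unknown mode: {mode}")
    if transition == "rise":
        return crp_rise
    if transition == "fall":
        return crp_fall
    if transition == "mismatch":
        # normal: take the smaller of the two values;
        # same_transition: no pessimism is removed at all.
        return min(crp_rise, crp_fall) if mode == "normal" else 0
    raise ValueError(f"unknown transition type: {transition}")


# Hypothetical CRP values in picoseconds.
print(crp_used("rise", "normal", 120, 90))
print(crp_used("mismatch", "normal", 120, 90))
print(crp_used("mismatch", "same_transition", 120, 90))
```

The two modes differ only in the mismatch row, so same_transition is the more conservative setting whenever the launch and capture clocks need opposite edges at the common point.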


5.3 var: timing_crpr_threshold_ps {max = hardware dependent; 20ps:default; min = 2e-5ps}

5.3.1 Action
Determines how much pessimism the CRP value used in report_timing is allowed to leave in the report. See the
explanation below of the differences between report_crpr and report_timing CRP values for more details. Note
that this variable can have an exponential effect on runtime. Setting a large value will (in some cases)
considerably speed up the CRP calculation in update_timing, but will lead to a corresponding loss of
accuracy. This mechanism therefore lets the user trade off runtime against the accuracy of the calculated
CRP value.

The value of this variable determines the level of common point [1] compression (i.e. merging): adjacent
common points are merged where the difference between their calculated CRP values is less than the specified
value of timing_crpr_threshold_ps.

This means that a set of points in the clock network is removed from the computation, namely those where
the difference in calculated CRP is not more than the specified threshold value. Hence, for the report of
interest, the value of CRP in report_timing may fall short of the actual amount of CRP (as reported by
report_crpr; see the section entitled “report_timing & report_crpr may show slightly different CRP values”
below) by up to the threshold value, i.e.

CRP in report_timing ∈ range[ (actual CRP - timing_crpr_threshold_ps), (actual CRP)]
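
The bound can be checked on a toy model of common point compression. The sketch below is an assumption-laden illustration, not PrimeTime's algorithm: it assumes that common points are visited in order of increasing CRP, that a point is merged into the current group while its CRP is less than the threshold above the group's first point, and that a merged point reports the CRP of that first (smallest) point. The function name and the CRP values are hypothetical.

```python
def compress_common_points(actual_crp_ps, threshold_ps):
    """Map each actual CRP value to the CRP reported after merging."""
    reported = {}
    representative = None
    for value in sorted(actual_crp_ps):
        if representative is None or value - representative >= threshold_ps:
            representative = value  # difference reaches the threshold: new group
        reported[value] = representative
    return reported


# Hypothetical CRP values (ps) at successive common points in a clock tree.
ACTUAL = [100, 103, 107, 112, 118, 131, 133]

for threshold in (5, 20):
    reported = compress_common_points(ACTUAL, threshold)
    groups = len(set(reported.values()))
    worst_error = max(actual - rep for actual, rep in reported.items())
    print(f"threshold {threshold} ps: {groups} groups, "
          f"largest under-estimate {worst_error} ps")
    # Every reported value lies in [actual - threshold, actual].
    assert all(a - threshold <= r <= a for a, r in reported.items())
```

A larger threshold leaves fewer groups to process, which is where the runtime saving comes from, at the cost of a larger possible under-estimate of the CRP credited in report_timing.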

In the case of SI analysis, this variable plays a more crucial role in determining the complexity of the CRP
calculation, since, there are 2 sets of arrival times under consideration; delta free & delta inclusive arrival
times.

When comparing the difference in CRP between 2 adjacent points, both CRP values are checked, one
calculated from crosstalk delta free arrivals, the other from crosstalk delta inclusive arrivals.

In addition, depending upon the value of the variable and the values of the SI delta delays considered, the
level of common point compression in the CRP calculation with SI on may well be smaller than with SI off.
This leads to correspondingly higher memory and runtime requirements with SI on. It also results in a
difference between the values of CRP calculated with SI analysis turned on and with SI analysis turned off.
However, both values are guaranteed to be within the specified threshold value.

Please refer to the section titled “CRPR & SI analysis” for more details on this topic.

Having read the above, the astute reader will have surmised that setting the threshold to a very small value
impacts runtime and memory very significantly. However, different threshold values may be used for the
setup and hold analysis runs.

For hold checks, setting the threshold to a minimal value gives a better quality analysis, since a more
accurate value of CRP is used in assessing the slack for latch to latch paths that are temporally extremely
close. For example, if two registers are very close (temporally closer than the CRP threshold), they are
considered to be the same point as far as the CRP analysis goes. Reducing the threshold value gives greater
accuracy for the analysis of these critical paths. In this case, the setup check is not of concern.

[1] The common point is defined as the output pin of the last cell in the common portion of the clock network.


For setup checks, the problem occurs on longer paths. In this case a larger threshold value can be used
without affecting the result too much, since it is highly unlikely that a pair of registers in the critical
path set of interest will be temporally closer than the CRP threshold. In this case, the hold check is not of
concern.

It is of course up to the user to determine the value of the threshold for each type of run.

Additionally, indiscriminate usage of this variable can have a serious effect on CPU runtime and memory
usage. Please see the section entitled “CPU & Memory Performance” for more details.

5.4 var: crpr_consider_level_sensitive_edge {both:default, open, close}
IMPORTANT NOTE: THIS VARIABLE APPLIES TO THE 2002.09 RELEASE ONLY
5.4.1 Action
This variable lets the user select the clock edge that is used in the calculation of CRP for transparent
devices at which time borrowing is occurring. Time borrowing occurs when data arrives between the opening
and closing edges of a transparent latch. This variable does not affect the CRP calculation at transparent
devices unless they are actually borrowing.

The user can select either the opening edge, the closing edge or (to avoid potential optimism) the edge
that leads to the more pessimistic value of CRP as the basis for the CRP calculation.

crpr_consider_level_sensitive_edge behavior

                                          crpr_consider_level_sensitive_edge
timing_clock_reconvergence_pessimism      open              close             both
normal                                    device opening    device closing    min of crp_rise
                                          edge (rise/fall)  edge (rise/fall)  and crp_fall
same_transition                           device opening    device closing    zero
                                          edge (rise/fall)  edge (rise/fall)

5.5 cmd: report_crpr (from T-2002.09)

Action: Reports details of the CRPR calculation for a launch / capture register pair.

For the U-2003.03 release, report_crpr has been modified to include additional information on the CRP
calculation itself. It provides options to view the complete path from the clock source to the common point
and accounts for derated arrival times.

Please refer to the PrimeTime man page for more detailed information on this report.


5.6 cmd: report_timing


Action; The CRPR value is reported by report_timing as part of the slack calculation when the variable
timing_remove_clock_reconvergence_pessimism is set to TRUE. Depending upon the setting of the
variable timing_crpr_threshold_ps, and the path reported upon, the value of CRP reported by report_timing
may vary with respect to the value reported by report_crpr. See section 6 for more details.


6 report_timing & report_crpr may show slightly different CRP values
The difference arises from the different mechanisms that report_timing and report_crpr use to calculate the
value of CRP.

• The value of CRP displayed by report_timing may be less than the value of CRP displayed by
report_crpr. It is likely that the computed values used by report_crpr & report_timing will be
different for certain paths in the design.

report_crpr will always report the ACTUAL CRP value. Its calculation is based upon processing a single
-from / -to pair. This is not computationally expensive, since it can only ever be between a pair of
sequential device clock pins. It is therefore the most accurate way to measure CRP in a design.

The CRP value used by report_timing must be computed by update_timing, which involves (in many cases)
significant computational effort, as it must calculate CRP values over the entire design. This requires
building a complete picture of the entire clock network & assessing the CRP values for all startpoints and
endpoints that are required for reporting. The reporting sets themselves may be very large.

In order to reduce the computational cost, we use the variable timing_crpr_threshold_ps (default = 20ps) to
help reduce the size of the computation. This introduces common point [2] compression (i.e. merging):
adjacent common points are merged where the difference between their calculated CRP values is less than the
specified value of timing_crpr_threshold_ps.

This means that a set of points in the clock network is removed from the computation, namely those where
the difference in calculated CRP is not more than the specified threshold value. Hence, for the report of
interest, the value of CRP in report_timing may fall short of the actual amount of CRP by up to the
threshold value, i.e.

CRP in report_timing ∈ range[ (actual CRP - timing_crpr_threshold_ps), (actual CRP)]

Therefore the CRP value produced by report_timing and the CRP value produced by report_crpr will differ
by a value not greater than the timing_crpr_threshold_ps value.

In order for the CRP values in report_timing & report_crpr to agree for every path in the design, no two
adjacent common points in the clock network should have a CRP value differing by less than the value set on
the variable timing_crpr_threshold_ps.

Setting timing_crpr_threshold_ps to the minimum value is not recommended since the additional runtime
expense is rarely justified in the face of alternate sources of greater inaccuracy within the timing analysis
itself. The selection of the value for timing_crpr_threshold_ps value should be based on a consideration of
the level of inaccuracy acceptable to the user.

[2] The common point is defined as the output pin of the last cell in the common portion of the clock network.


7 CRPR & Latches


From U-2003.03 onwards a more precise scheme for transparent latch handling has been introduced. The
details of this mechanism are given in the application note “Transparent Latch Enhancements, application
note for PrimeTime 2003.03”.

In PrimeTime T-2002.09 only, the edge of a level sensitive device that is used to calculate the CRP value
may be specified. Refer to section 5, “CRPR command & variable overview”, and the subsection titled
“var: crpr_consider_level_sensitive_edge” for more information on the behavior in the 2002.09 release.


8 CRPR & SI analysis


In the case of Signal Integrity (SI) analysis, coupling capacitors between networks are additionally considered
in delay calculation, where a victim network has its delay dynamically affected by the switching
characteristics of the aggressor network. These effects are referred to below as ‘delta’ delays (Δ).

As stated above, in SI analysis delta delays are calculated from aggressor networks. It is only valid to
factor delta delays into the CRPR analysis under the following condition:

• Aggressor switching must affect both the launching and capturing signal at the same time.

Note that under some circumstances a difference may be seen between the CRP results with SI on and SI off.
This is caused by the threshold value (timing_crpr_threshold_ps) being set too high. This is explained
further in section 5, “CRPR command & variable overview”, in the subsection describing the variable
timing_crpr_threshold_ps.

Aggressor switching must affect both the launching and capturing signal at the same time

If the launch and capture signals for a particular path are a clock edge apart, and delta delays crucially depend
upon the temporal relationship between victim and aggressor, then in reality, the delta delays affecting the
launching and capturing signals will be different.
Indeed, different aggressors, switching in different ways may well affect the launching and capturing signals.
This will lead to different delta delay values that will either speed up or slow down the victim network. How
the victim is affected is therefore entirely dependent upon the aggressor switching cycle. This effect can be
observed in the figure below.
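[Figure: three voltage (Vh/Vl) versus time (t) waveforms. The victim waveform has its launch and capture
edges labelled “Delayed Edge” and “Unaffected” respectively; a second victim trace is annotated with min Δ
and max Δ.]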

As a further example, consider the diagram below, where we have a victim (clk) and an aggressor signal
“agg1”. Suppose there is another signal in the aggressor set, “agg2”, and that during clock cycle ‘n’ agg1
is active and agg2 is stable, while at clock cycle ‘n+1’ the situation is reversed, that is, agg1 is stable
and agg2 is active. Then the delta delay given to the victim network during clock cycle n will be different
from the delta delay given to the victim at clock cycle n+1. The foregoing is extensible, as depicted by
the signal agg3, which influences the same network during the next clock cycle.
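[Figure: the victim clock network clk, passing through U1 and U2 to the common point, is coupled to the
aggressors agg1, agg2 and agg3, with coupling capacitances C1=10, C2=20 and C3=20 and delta delays Δ1, Δ2
and Δ3. The accompanying waveforms show agg1, agg2 and agg3 against clk, with Δ1, Δ2 and Δ3 occurring at
successive Launch, Capture and Launch edges.]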

PrimeTime and PrimeTime SI (U-2003.03 onwards) have been enhanced to ensure that the dynamic properties
of SI delta delays are considered correctly in the CRP calculation.
In PrimeTime T-2002.09 and earlier, delta delays are handled in the same way as regular delays in the CRPR
analysis. This implementation does not account for the dynamic nature of delta delays. It is therefore not
recommended to use CRPR with SI for sign-off analysis in T-2002.09 and earlier releases.
In PrimeTime U-2003.03 and later, the dynamic nature of delta delays is accounted for correctly. For a
given timing check, any CRP arising from delta delays is removed only if precisely the same clock edge
drives both the launching and capturing devices. Such checks are broadly classified as zero cycle checks.
The zero cycle behaviour itself may be intentional or accidental. In all cases, PrimeTime considers the delta
delays from SI analysis as part of the CRP calculation.

This is a valid approach because (as per the previous section), for delta delays from SI analysis to
contribute to removable CRP, the following must hold:

• Aggressor switching must affect both the launching and capturing signal at the same time.
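The rule above can be expressed as a short Python sketch. It is a simplified model rather than PrimeTime's implementation: the function name removable_crp, its parameters and the example delays are hypothetical. It assumes the common clock path has a static (crosstalk-free) max and min delay, with SI delta delays added on top of each.

```python
def removable_crp(static_max, static_min, delta_max, delta_min, same_clock_edge):
    """Return the CRP (in ps) that may be removed for one timing check.

    static_max / static_min: max and min delay of the common clock path
    without crosstalk. delta_max / delta_min: SI delta delays added to the
    max and min paths. same_clock_edge: True for a zero cycle check.
    """
    static_crp = static_max - static_min
    if same_clock_edge:
        # One physical edge drives launch and capture, so the aggressor
        # affects both alike and the delta-delay pessimism is removable too.
        return static_crp + (delta_max - delta_min)
    # Different edges: delta delays may differ from cycle to cycle, so only
    # the static part of the pessimism is removed.
    return static_crp


# Hypothetical numbers: the same common path seen by a zero cycle check
# and by a check spanning one clock cycle.
zero_cycle_crp = removable_crp(520.0, 480.0, 15.0, -10.0, same_clock_edge=True)
one_cycle_crp = removable_crp(520.0, 480.0, 15.0, -10.0, same_clock_edge=False)
print(zero_cycle_crp, one_cycle_crp)
```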


Zero cycle behavior occurs most frequently in hold timing checks; however, there is a subset of corner cases
in which it also applies. The following (non-exhaustive) list gives examples of these circuit topologies /
checks.

1. In standard hold checks, as mentioned above, and the hold check corner cases mentioned in the
subsections below.
a. Where there is feedback from the output of a register back to its input (intentional or
otherwise).
i. Intentional feedback would be a direct feedback path from !Q (inverse of Q) to the
data input (D) of a latch, as in a generated divide-by-2 clock for instance.
ii. Unintentional feedback would be via crosstalk coupling that may couple the output
of a sequential device back to its own input.
b. Where clock skew is employed to drive both the launching and capturing clock signals. In
this case, it is necessary to set a multicycle path of 0 along the data path of interest.
2. In certain setup checks where transparent latches are involved or where a multicycle constraint of zero
has been set on the data path.


9 CRPR & IR Drop


IR drop is a dynamic effect, for which instantaneous values of maximum and minimum rail voltage can be
annotated. Cell delay is calculated taking into account the annotated (or default) rail voltage. No enabling
or setup is required to ensure that this is the case.

The CRP calculation for crp_rise and crp_fall is the difference between the maximum and minimum
path delays along the common portion of the clock network, and each path delay is the cumulative sum of
network and cell delays. The difference in cell delays due to the annotated voltage values is therefore
accounted for in the max and min paths through normal processing.

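[Figure: launching device FF1 and capturing device FF2 connected by a data path to OUT. CLK drives buffers U1 and U2 up to the common point. U1 and U2 are supplied by rail voltages V1 and V2, each annotated with max/min values, and VSS = 0 V.]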
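The way annotated rail voltages feed into the CRP can be sketched as follows. This is a minimal illustration under stated assumptions: the linear derating in cell_delay, the nominal voltage, the sensitivity factor and all delay and voltage numbers are hypothetical. Only the structure (CRP as max minus min cumulative delay over U1 and U2 on the common path) follows the text.

```python
def cell_delay(nominal_ps, rail_v, nominal_v=1.2, sensitivity=0.8):
    """Hypothetical derating: cell delay rises as the rail voltage falls."""
    return nominal_ps * (1.0 + sensitivity * (nominal_v - rail_v) / nominal_v)


# Common clock path: (cell, nominal delay ps, min rail V, max rail V, net delay ps)
common_path = [
    ("U1", 100.0, 1.08, 1.20, 20.0),
    ("U2", 100.0, 1.14, 1.20, 20.0),
]

# The max path delay uses the lowest annotated rail voltage of each cell,
# the min path delay uses the highest.
max_delay = sum(cell_delay(d, v_min) + net for _, d, v_min, _, net in common_path)
min_delay = sum(cell_delay(d, v_max) + net for _, d, _, v_max, net in common_path)

# CRP is the max/min difference accumulated along the common portion.
crp = max_delay - min_delay
print(f"max = {max_delay:.1f} ps, min = {min_delay:.1f} ps, CRP = {crp:.1f} ps")
```

No separate IR-drop term appears in the CRP: the voltage dependence enters only through the cell delays that make up the max and min path delays.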


10 CRPR & Support of Multiplexed Clocks


In releases of PrimeTime prior to V-2003.12, combined related clocks are not handled.

In PrimeTime V-2003.12 and subsequent releases the CRPR algorithm has been made more accurate for the
case where two or more related clocks are combined in a complex fashion.

Two clock signals are considered related if one is derived from the other. An example is a single clock source
which is used to directly derive another clock signal of half the original frequency via a divide-by-2 circuit.
This new clock is called a “related generated clock.”

In previous releases of PrimeTime, the CRPR algorithm ignored related generated clocks in its determination
of the common node. This caused the algorithm to consider only the most pessimistic case for this type of
clocking situation. Where multiplexors or other logical clock manipulation circuitry was used, it was possible
to use case analysis to force the algorithm to explicitly consider related generated clocks.

This is illustrated in the figure below

In the V-2003.12 release of PrimeTime, you can still use case analysis to specify the exact conditions for
CRPR analysis. However, the CRPR algorithm now explicitly considers related generated clocks in its
determination of the common node, even in the absence of case analysis.


Additionally, in the V-2003.12 release, the report_crpr command was modified slightly to support the
analysis of related generated clocks. The option -clock clock_name is replaced by two new options:

-from_clock clock_name
-to_clock clock_name

The two options specify the names of the clocks that fan out to the “from” and “to” latches, respectively.


11 CPU & Memory performance


In releases of PrimeTime prior to V-2003.12, CPU runtime and memory usage were significant bottlenecks
in the use of CRPR as a sign-off capability. These issues stem from two main sources.

• Usage of extremely low CRPR threshold values.

Setting the threshold variable to a very low value (generally less than 1ps) caused significant
amounts of additional data to be produced within PrimeTime. This caused an exponential increase in
CPU runtime and memory usage. The net effect was that some designs would not complete on
32-bit architectures, whereas others suffered from excessive runtime and memory usage.

• Growth in the usage of extremely complex clock networks having many generated / related clocks
and clocking subsystems.
In order to meet the growing requirement for faster and/or lower power devices, chip designers
have created a wide variety of complex clocking schemes that utilize many generated and related
clocks, multiplexors, pulsed clocking schemes etc. There is also an increasing trend of using a
variety of clock tree synthesis mechanisms and methodologies that produce a gamut of clocking
structures that have not previously been encountered.

Together, these two points created a significant requirement to improve both CPU runtime and memory
utilization, and to enhance the supporting infrastructure.

In the V-2003.12 release of PrimeTime and beyond, significant improvements have been made to the
algorithmic engine used in processing the CRPR calculation. This modification, in conjunction with
significant infrastructure enhancements, has resulted in significantly improved performance of this feature.

Additionally, the sensitivity of CPU runtime and memory usage to timing_crpr_threshold_ps has changed
significantly. This is shown in the tabular performance data below. Both Table 1 and Table 2 give
performance comparison data against the U-2003.03 release of PrimeTime. For example, the figure 3.62 in
the top left hand corner of Table 1 means that CPU runtime improved by a factor of 3.62 for a threshold
setting of 50ps. In Table 2, the figure 2.72 in the top left hand corner means that memory usage was 2.72
times lower in V-2003.12 than in U-2003.03 for a threshold setting of 50ps.

Table 1 : CPU Gain in V-2003.12 Vs U-2003.03

            CRPR threshold (picoseconds)
Design         50     20     10      5      2      1    0.2
s156664      3.62  14.01  13.91  13.91  13.86  13.86   8.43
s160471      3.12   6.52   6.54   6.63   5.59   2.29   2.20
s136512      6.77  14.57  13.75  15.02  13.37  13.07  14.54
s123351      2.36   4.16   2.08   1.94   2.27   2.41  20.42
s162512      2.25   1.55   3.38   3.18   3.19   3.10   2.76
s134557      4.29   7.52   9.28   7.15   6.14   5.19   5.17
Average:     3.74   8.06   8.16   7.97   7.40   6.65   8.92

Table 2 : Memory Gain in V-2003.12 Vs U-2003.03

            CRPR threshold (picoseconds)
Design         50     20     10      5      2      1    0.2
s156664      2.72  19.90  19.81  19.83  19.79  19.79  12.70
s160471      2.74   8.06   8.05   8.04   7.17   3.12   2.98
s136512      2.62   5.69   3.80   4.05   4.64   5.02   9.08
s123351      1.45   5.63   4.16   3.15   4.49   4.39  11.57
s162512      1.50   1.28   3.00   3.11   4.06   4.10   3.69
s134557      5.12   8.02  10.19   8.00   7.45   6.64   6.70
Average:     2.69   8.10   8.17   7.70   7.93   7.18   7.79

The data above show the following (peak entries and the range of the per-threshold averages):

Peak Gain
CPU : 20.42X
Mem : 19.90X
Average Gain
CPU : 3.74 - 8.92X
Mem : 2.69 - 8.17X
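The Average rows and the peak entries can be recomputed directly from Table 1 and Table 2. The Python sketch below does this. The data are copied from the tables, while the names CPU_GAIN, MEM_GAIN, column_averages and peak are illustrative and not part of any PrimeTime interface.

```python
THRESHOLDS_PS = [50, 20, 10, 5, 2, 1, 0.2]

# Table 1: CPU gain in V-2003.12 versus U-2003.03, one row per design.
CPU_GAIN = {
    "s156664": [3.62, 14.01, 13.91, 13.91, 13.86, 13.86, 8.43],
    "s160471": [3.12, 6.52, 6.54, 6.63, 5.59, 2.29, 2.20],
    "s136512": [6.77, 14.57, 13.75, 15.02, 13.37, 13.07, 14.54],
    "s123351": [2.36, 4.16, 2.08, 1.94, 2.27, 2.41, 20.42],
    "s162512": [2.25, 1.55, 3.38, 3.18, 3.19, 3.10, 2.76],
    "s134557": [4.29, 7.52, 9.28, 7.15, 6.14, 5.19, 5.17],
}

# Table 2: memory gain in V-2003.12 versus U-2003.03.
MEM_GAIN = {
    "s156664": [2.72, 19.90, 19.81, 19.83, 19.79, 19.79, 12.70],
    "s160471": [2.74, 8.06, 8.05, 8.04, 7.17, 3.12, 2.98],
    "s136512": [2.62, 5.69, 3.80, 4.05, 4.64, 5.02, 9.08],
    "s123351": [1.45, 5.63, 4.16, 3.15, 4.49, 4.39, 11.57],
    "s162512": [1.50, 1.28, 3.00, 3.11, 4.06, 4.10, 3.69],
    "s134557": [5.12, 8.02, 10.19, 8.00, 7.45, 6.64, 6.70],
}


def column_averages(table):
    """Average gain per threshold setting, across all designs."""
    return [sum(col) / len(col) for col in zip(*table.values())]


def peak(table):
    """Largest single gain anywhere in the table."""
    return max(max(row) for row in table.values())


cpu_avg = column_averages(CPU_GAIN)
mem_avg = column_averages(MEM_GAIN)

for name, avg, table in (("CPU", cpu_avg, CPU_GAIN), ("Mem", mem_avg, MEM_GAIN)):
    per_threshold = ", ".join(
        f"{t}ps: {a:.2f}" for t, a in zip(THRESHOLDS_PS, avg)
    )
    print(f"{name} peak {peak(table):.2f}X, averages {per_threshold}")
```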

Users should note that the performance of the new algorithm is highly dependent on design style. This is
shown in the following graphs for STARs 156664 and 123351. It should be noted that setting the CRPR
threshold close to zero will give a scalar gain of 1X (no gain) in all cases.

[Figure: V-2003.12 performance for STAR 156664. CPU gain and memory gain (vertical axis, 0 to 25) plotted against timing_crpr_threshold_ps (50, 20, 10, 5, 2, 1, 0.5, 0.2).]


[Figure: V-2003.12 performance for STAR 123351. CPU gain and memory gain (vertical axis, 0 to 25) plotted against timing_crpr_threshold_ps (50, 20, 10, 5, 2, 1, 0.5, 0.2).]

As the above illustrations show, the performance of a particular design at a particular threshold setting
depends strongly on the design style itself.


12 Understanding calculation pessimism introduced by the CRPR threshold
12.1 History
Prior to the 2003.12 release, the minimum value of the variable timing_crpr_threshold_ps was 2e-5
picoseconds. In the 2003.12 release, the minimum threshold value was changed to 1ps.

12.2 What changed in CRPR for 2003.12 ?

There were significant modifications to the CRP processing algorithm in the 2003.12 release. These
modifications resulted in significant improvement in runtime and memory, as well as quality.

In these modifications, the way that the threshold variable is used to drive the processing algorithm was
modified.

Tests have shown that setting a very low value for the threshold gives a very small gain in accuracy, but
creates a very poor processing condition for the algorithm.

The processing condition that results from a low threshold setting is the same condition that results from
setting CRPR to TRUE with a pre-synthesized clock network.

The default value of the CRPR threshold is 20 picoseconds. This is a meaningful value for designs in the
90 nm to 180 nm geometry range. The default threshold is reviewed every release to decide if it is still
applicable to the design domain. It represents a proportion of the typical clock tree stage delay for this
technology range, and takes into account other sources of error in the analysis process. Some of these
sources of error are elaborated upon below.

12.3 Why select 1ps as the minimum threshold value?

Consider a design with a clock frequency of 500 MHz in a 0.13 micron process.

1 picosecond is 0.05%, or 1/2000th, of the clock period.

The timing_crpr_threshold_ps value means that there may be some pessimism in the CRP value reported by
report_timing.

With the threshold set to 1ps, the amount of this pessimism will be:

Minimum 0ps, Maximum 1ps.

Recall that report_crpr can be used to report the exact value of CRP in every case.
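The figures above follow from simple arithmetic, sketched here in Python. The helper reported_crp_range is hypothetical and only illustrates the stated bound: the CRP used by report_timing may fall short of the exact CRP by anything between 0 and the threshold.

```python
clock_mhz = 500.0
period_ps = 1e6 / clock_mhz                      # 500 MHz gives a 2000 ps period
threshold_ps = 1.0

fraction_of_period = threshold_ps / period_ps    # 1/2000 of the period
percent_of_period = 100.0 * fraction_of_period   # 0.05 percent


def reported_crp_range(exact_crp_ps, threshold_ps):
    """Bounds on the CRP credited by report_timing, assuming it is never
    more than the exact value and never short by more than the threshold."""
    return (max(0.0, exact_crp_ps - threshold_ps), exact_crp_ps)


# Hypothetical exact CRP of 85 ps, as report_crpr might report it.
low, high = reported_crp_range(85.0, threshold_ps)
print(f"threshold is {percent_of_period:.2f}% of the period")
print(f"reported CRP lies between {low} ps and {high} ps")
```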

In the analysis environment, there are many sources of uncertainty and conservatism that are far greater than
the error introduced by the CRPR threshold value itself. A designer will typically need to consider a far
greater margin than 1 picosecond in order to determine the stability of the design.


For example, clock jitter is one such property that a designer must consider. Clock latency is also a figure
that is subject to some error, as is the datapath delay.

Static Timing Analysis typically operates in an environment having a numerical accuracy that is within 0 to
3% of SPICE. We can show that the error introduced by the CRP threshold value into the slack calculation is
far less than other sources of error inherent to STA. This can be shown with a simple calculation.

For example, consider the following design scenario, in which we examine the datapath of a fast design with
a critical (or near critical) datapath in min and max analysis (as in a setup check). We also consider a tight
maximum calculation error with respect to SPICE, in order to make the analysis realistic.

Clock speed : 500 MHz
Clock period : 2 nanoseconds
CRPR threshold : 1 picosecond
Maximum STA versus SPICE calculation error : 3%
SPICE simulated maximum delay along a datapath : 1.9 nanoseconds
SPICE simulated minimum delay along a datapath : 1.2 nanoseconds

12.4 Max calculation

Normal STA Error = SPICE datapath timing * STA vs. SPICE calculation error
                 = 1.9ns * 0.03
                 = 0.057ns (57 ps)

This is the maximum total STA error we would see in the slack calculation with respect to SPICE for this
datapath if we performed an exact calculation of CRP with report_crpr.

Total STA Error = Normal STA Error + maximum CRP threshold error
                = 0.057ns + 0.001ns
                = 0.058ns (58 ps)

This is the maximum total STA error we would see in the slack calculation with respect to SPICE for this
datapath if we performed a CRP calculation with report_timing. Note that the minimum total STA error is
the same in both cases, as the minimum CRP threshold error is 0.

Maximum possible error in the slack calculation, as a percentage of the SPICE datapath delay (1.9ns):

CRPR threshold : 0.053% (0.001ns / 1.9ns)
Normal STA Error : 3%
Total STA Error : 3.053%

Min calculation

Normal STA Error = SPICE datapath timing * STA vs. SPICE calculation error
                 = 1.2ns * 0.03
                 = 0.036ns (36 ps)

This is the maximum total STA error we would see in the slack calculation with respect to SPICE for this
datapath if we performed an exact calculation of CRP with report_crpr.

Total STA Error = Normal STA Error + maximum CRP threshold error
                = 0.036ns + 0.001ns
                = 0.037ns (37 ps)

This is the maximum total STA error we would see in the slack calculation with respect to SPICE for this
datapath if we performed a CRP calculation with report_timing. Note that the minimum total STA error is
the same in both cases, as the minimum CRP threshold error is 0.

Maximum possible error in the slack calculation, as a percentage of the SPICE datapath delay (1.2ns):

CRPR threshold : 0.083% (0.001ns / 1.2ns)
Normal STA Error : 3%
Total STA Error : 3.083%

Hence we can state that for this datapath, the error range introduced by the CRPR threshold in maximum
and minimum datapath analysis is:

Range [ 0.053% (maxpath analysis) : 0.083% (minpath analysis) ]

In both types of analysis, the inherent STA error that we see (3% in this case) in the datapath is far greater
than the maximum error introduced by the minimum CRPR threshold (1 picosecond). The only way to
remove the error from the datapath calculation is to perform transient (SPICE) simulation.
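The max and min calculations can be reproduced with a short Python sketch. It assumes that each percentage is taken relative to the SPICE datapath delay of the case in question. The function sta_error_budget and its field names are illustrative only.

```python
def sta_error_budget(datapath_ns, sta_error_fraction=0.03, threshold_ns=0.001):
    """Worst-case slack error for one datapath, in ns and as a percentage
    of the SPICE datapath delay (an assumed reference for the percentages)."""
    normal_ns = datapath_ns * sta_error_fraction   # inherent STA vs. SPICE error
    total_ns = normal_ns + threshold_ns            # plus worst-case threshold error
    return {
        "normal_ns": normal_ns,
        "total_ns": total_ns,
        "threshold_pct": 100.0 * threshold_ns / datapath_ns,
        "total_pct": 100.0 * total_ns / datapath_ns,
    }


max_case = sta_error_budget(1.9)   # SPICE maximum datapath delay
min_case = sta_error_budget(1.2)   # SPICE minimum datapath delay

for label, case in (("max", max_case), ("min", min_case)):
    print(
        f"{label}: normal {case['normal_ns'] * 1000:.0f} ps, "
        f"total {case['total_ns'] * 1000:.0f} ps, "
        f"threshold share {case['threshold_pct']:.3f}%"
    )
```

In both cases the threshold contribution stays below a tenth of a percent of the datapath delay, against an inherent STA error of 3%.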

Error due to imprecise delay calculation will always be present in STA, and there is no currently known
method of removing it other than performing transient simulation. It is therefore unrealistic that, in any real
design analysis scenario, the error in the CRP calculation should determine whether a particular path passes
or fails.


13 Known Issues / Limitations


As of the V-2003.12 release, all the issues discussed below are resolved. This is by virtue of a rework of the
internal infrastructure that supports CRPR. This infrastructure enhancement has resulted in a significant
reduction in the number of ongoing known issues.

The following are known issues listed by release, starting with the 2002.03 release.

Gated & Generated clocks : In all releases up to T-2002.09, generated clock behavior is incorrect under
some circumstances. The problem occurs when the clock sense propagation at the common point in the
clock tree is determined incorrectly.

Generated clocks : In all releases prior to 2003.03, where a generated clock is declared on an inout port,
the CRP value calculated will be incorrect.

All the above issues are fixed in the U-2003.03 release.


14 References
1. Zejda, J., and Frain, P., 2002. “General Framework for Removal of Clock Network Pessimism.” Design
Automation Conference 2002 proceedings.
2. Transparent Latch Enhancements, application note for PrimeTime 2003.03.
