
TCITSMCN40GGPMPLLA1_guide

True Circuits, Inc.
General-Purpose PLL for TSMC CLN40G 40nm
Guidelines and Specifications
Revision 1.1

*** True Circuits, Inc. Confidential ***
Copyright (C) 2000-2011 True Circuits, Inc. All rights reserved.

Purpose

This document describes a General-Purpose PLL design in the TSMC CLN40G 40nm 0.9V IC process. This PLL design is a flexible core library macro that performs frequency synthesis within the noisy environment of large ASIC designs. Low-jitter operation is a primary concern in these large devices that often contain multiple phase-locked loops, RAM, and hundreds of high current / high speed output buffers. Due to the high frequency capability of the VCO at the minimum PVT corner, internal high performance dividers are included to eliminate false locking seen with external divider architectures. The PLL has dedicated analog VDDA and VSSA supply pads which are preferred for best jitter performance.

General-Purpose PLL

The General-Purpose PLL is designed to multiply an input clock signal by an integer between 1 and 64. It also provides basic deskew functionality. It contains a 1-16 divider at the reference clock input, a 1-64 divider in the internal feedback path, and a 1-16 divider at the output. The output is 50% duty cycle for all output divider values. The -3dB bandwidth is adjustable over a factor of 4 range.

Contents

This document contains the following sections. It is highly recommended that the user read each section carefully.

- General-Purpose PLL Specifications
- Line Item Definitions of Specifications
- PLL Default Settings
- Modes Of Operation
- PLL Output Frequency
- Frequency Programming Calculation Script
- PLL Bandwidth Adjustment
- PLL Bypass Mode
- PLL Test Mode
- PLL Behavioral Modeling
- PLL Feedback Path
- PLL Feedback Delay
- Minimizing Jitter
- Timing Budgets
- PLL Power-up and Reset Operation
- Timing Relationships for PLL Digital Signal Inputs
- PLL Cycle Slip Detection
- Cascaded PLL Configurations
- Chip Integration Issues
- PLL Analog Supplies
- Chip Layout Guidelines
- Package and Board Guidelines
- PLL Bench/Performance Testing Procedure
- PLL Production Testing Guidelines
- Additional Notes
- Addendum to Line Item Definitions of Specifications
- Definitions
- Deskew PLL Application Notes


General-Purpose PLL Specifications

Performance Specifications

- Divided reference frequency range           13.3MHz - 1.7GHz
- /1 output frequency range                   340MHz - 1.7GHz
                                              (VCO output internally divided by 2 for 50% DC)
- Reference divider values                    1-16
- Feedback divider values                     1-64
- Output divider values                       1-16
- /1 output multiples of div. reference       1-64
- Bandwidth adjustment div. range             1-64
- Feedback signal delay (max)                 NF/1.7GHz
- Output duty cycle (nom, tol)                50%, +/-2%
- Static phase error (max)                    +/-1.25% div. reference cycle
- Period jitter (P-P) (max)                   +/-3% output cycle
- Input-to-output jitter (P-P) (max)          +/-1.5% div. reference cycle
  (jitter numbers are worst-case estimates with supply and substrate
  noise levels below -- actual results will be better)
- Power dissipation (nom)                     2.5mA @ 850MHz (/1 output)
- Reset pulse width (min)                     5us
- Reset /1 output frequency range             10MHz - 100MHz
- Lock time (min allowed)                     500 div. reference cycles
  (actual lock time will be much smaller)
- Freq. overshoot (full-/half-) (max)         40%/50%
- Area (including isolation) (max)            0.020mm^2
- Number of PLL supply pkg. pins              1 VDDA, 1 VSSA (preferred)
- Low freq. supply noise est. (P-P) (max)     10% VDDA
- Low freq. sub. noise est. (P-P) (max)       10% VDDA
- Ref. input jitter (long-term, P-P) (max)    2% div. reference cycle
- Reference/Feedback H/L pulse width (min)    150ps
- Process technology                          TSMC CLN40G 40nm
- Supply voltage (VDD, VDDA) (nom, tol)       0.9V, +/-10%
- Junction temperature (nom, min, max)        70C, -40C, 125C

Pin List

- VDDA          Analog VDD
- VSSA          Analog VSS
- VDD           Digital VDD (connected to core VDD)
- VSS           Digital VSS (connected to core VSS)
- RCLK          Reference clock input
- FCLK          Feedback clock input
- CLKOUT        PLL clock output
- CLKR[0:3]     NR = CLKR[3:0] + 1, CLKR[0] is LSB
- CLKF[0:5]     NF = CLKF[5:0] + 1, CLKF[0] is LSB
- CLKOD[0:3]    OD = CLKOD[3:0] + 1, CLKOD[0] is LSB
- BWADJ[0:5]    Loop BW adj.: NB = BWADJ[5:0] + 1, BWADJ[0] is LSB
- RESET         Reset when high (also clears NR and NF counters)
- PWRDN         Power down when high
- INTFB         Select internal feedback path when high rather than FCLK
                (not shown in diagrams below)
- BYPASS        Reference-to-output bypass when high
- TEST          Reference-to-counters-to-output bypass when high
- RFSLIP        Reference cycle slip output (CLKOUT frequency high)
- FBSLIP        Feedback cycle slip output (CLKOUT frequency low)

Simplified Block Diagrams

- Normal/BYPASS Mode (TEST=0)
  (The multiplexers for TEST mode are not shown.)

       +-----------------------------------------------------------+
       |                                                           |
       |     +-----+      +-----+     +-----+                      |
FCLK --|---->|     |----->|     |     |     |      +-----+  0|\    |
       |     | /NF |      | PFD | ... | VCO |----->|     |-->| |---|-CLKOUT
CLKF --|-/-->|     |  /-->|     |     |     |      | /OD |   | |   |
       | 6   +-----+  |   +-----+     +-----+  /-->|     |   | |   |
       |              |                        |   +-----+   | |   |
       |              |                        |            1| |   |
       | /------------C------------------------C------------>| |   |
       | |            |                        |             |/    |
       | |   +-----+  |                        |              ^    |
RCLK --|-+-->|     |--/                        |              |    |
       |     | /NR |                           |              |    |
CLKR --|-/-->|     |                           |              |    |
       | 4   +-----+                           |              |    |
       |                                       |              |    |
CLKOD--|-/-------------------------------------/              |    |
       | 4                                                    |    |
       |                                                      |    |
BWADJ--|-/-- ...                                              |    |
       | 6                                                    |    |
       |                                                      |    |
RESET--|-- ...                                                |    |
       |                                                      |    |
PWRDN--|-- ...                                                |    |
       |                                                      |    |
BYPASS-|------------------------------------------------------/    |
       |                                                           |
TEST --|-- (=0)                    ...                           --|-RFSLIP
       |                                                           |
VDDA --|                           ...                           --|-FBSLIP
       |                                                           |
VSSA --|                                                           |
       |                                                           |
VDD  --|                                                           |
       |                                                           |
VSS  --|                                                           |
       +-----------------------------------------------------------+

- TEST Mode (TEST=1)
  (When NR=1, NF=1, or OD=1, outputs do not toggle.)

       +-----------------------------------------------------------+
       |                                                           |
       |                                                           |
       |                                                           |
       |                  +-----+                                  |
FCLK --|-- ...    /------>|     |                  +-----+         |
       |          |       | /NF |----------------->|     |---------|-CLKOUT
CLKF --|-/--------/   /-->|     |                  | /OD |         |
       | 6            |   +-----+              /-->|     |         |
       |              |                        |   +-----+         |
       |              |                        |                   |
       |              |                        |                   |
       |              |                        |                   |
       |     +-----+  |                        |                   |
RCLK --|---->|     |--/                        |                   |
       |     | /NR |                           |                   |
CLKR --|-/-->|     |                           |                   |
       | 4   +-----+                           |                   |
       |                                       |                   |
CLKOD--|-/-------------------------------------/                   |
       | 4                                                         |
       |                                                           |
BWADJ--|-/-- ...                                                   |
       | 6                                                         |
       |                                                           |
RESET--|-- ...                                                     |
       |                                                           |
PWRDN--|-- ...                                                     |
       |                                                           |
BYPASS-|-- ...                                                     |
       |                                                           |
TEST --|-- (=1)                    ...                           --|-RFSLIP
       |                                                           |
VDDA --|                           ...                           --|-FBSLIP
       |                                                           |
VSSA --|                                                           |
       |                                                           |
VDD  --|                                                           |
       |                                                           |
VSS  --|                                                           |
       +-----------------------------------------------------------+

Line Item Definitions of Specifications

- Divided reference frequency range: Allowed frequency range at the phase-frequency detector (PFD) input.
- /1 output frequency range: Frequency range of the PLL's output clock when the output divider (OD) is set to /1.
- Reference divider values: Allowed divider settings for the PLL's reference divider.
- Feedback (integer or fractional) divider values: Allowed divider settings for the PLL's feedback divider.
- Output divider values: Allowed divider settings for the PLL's output divider.
- /1 output multiples of div. reference: Provides the feedback multiply range.
- Bandwidth adjustment div. range: Allowed closed-loop 3-dB bandwidth (Fbw_3dB) adjustment range.
- Feedback signal delay (max): Maximum permitted delay in the feedback path from the output clock pin of the PLL to the feedback clock input pin of the PLL. This specification applies only to de-skew (DS) PLLs.
- Output duty cycle (nom, tol): Duty cycle of the PLL's output clock.
- Static phase error (max): Static phase error between a rising edge of the input reference clock and the corresponding edge of the PLL's feedback clock.
- Period jitter (P-P) (max): The difference between maximum and minimum measured cycle times of the PLL's output clock.
- Input-to-output jitter (P-P) (max): The difference between maximum and minimum measured offset between the input reference clock edge and the corresponding PLL output clock edge.
- Power dissipation (nom): Total power (analog + digital) dissipated by the PLL when locked.
- Reset pulse width (min): Minimum required pulse width of the PLL reset signal.
- Reset /1 output frequency range: Frequency range of the PLL output while the PLL is in the reset state and its output divider (OD) is set to /1.
- Lock time (min allowed): Maximum number of divided reference clock cycles it will take the PLL to achieve phase/frequency lock once reset is de-asserted.
- Freq. overshoot (full-/half-) (max): Maximum output frequency overshoot during lock acquisition as a percentage of the final, locked, PLL output frequency. "full" refers to the measurement of one full cycle (rising edge to following rising edge). "half" refers to the measurement of one half cycle (rising edge to following falling edge or falling edge to following rising edge, whichever happens to be smaller).
- Area (including isolation) (max): Total layout area occupied by the PLL including built-in isolation.
- Number of PLL supply pkg. pins: Number of supply pins dedicated to the PLL.
- Low freq. supply noise est. (P-P) (max): Maximum allowed low-frequency peak-to-peak noise voltage on the PLL supply pins as a percentage of nominal supply voltage.
- Low freq. sub. noise est. (P-P) (max): Maximum allowed low-frequency peak-to-peak substrate noise voltage in the vicinity of the PLL as a percentage of nominal supply voltage.
- Ref. input jitter (long-term, P-P) (max): Maximum allowed long-term jitter on the input reference clock.
- Reference/Feedback H/L pulse width (min): Minimum allowed pulse width (rising edge to following falling edge or falling edge to following rising edge, whichever happens to be smaller) on the input reference clock.
- Process technology: Process technology in which the PLL is developed and characterized.
- Supply voltage (VDD, VDDA) (nom, tol): Supply voltage (digital and analog) ranges over which the PLL's performance specifications are defined.
- Junction temperature (nom, min, max): Device junction temperature range over which the PLL's performance specifications are defined.

* For additional details, refer to the "Addendum to Line Item Definitions of Specifications" section.

PLL Default Settings

All PLL input control pins should be controllable through some sort of register. TCI does not recommend hard-wiring any control pins. If it is necessary to hard-wire any control pins, TCI should be consulted for feedback.

In addition, the following PLL control pins should be set at their default values.

Default values:

    NB = (! INTFB)? (NF * OD * Next) : NF

TCI should be consulted prior to programming the above controls to non-default settings.

Modes Of Operation

The PLL can operate in several modes:

1. Locked
   The positive edges of the PLL feedback and reference signals are phase aligned in normal operation.

2. Reset (RESET=1)
   The PLL outputs a fixed free-running frequency in the range of 10MHz to 100MHz for a divide by 1 output depending on the specific PLL type.

3. Power-down (PWRDN=1)
   All analog circuitry in the PLL is turned off so as to only dissipate leakage current. The digital dividers are not affected.

4. Bypass (BYPASS=1)
   The reference input is bypassed directly to the outputs.

5. Test (TEST=1)
   The reference input drives all dividers cascaded one after the other for production testing.

PLL Output Frequency

The relationship between the PLL output frequency and the reference frequency in normal locked operation depends on the divider inputs. All divider inputs are binary encoded, where an input of N typically causes the divider to divide by N+1.

The output frequency Fout at CLKOUT is related to the reference frequency Fref by:

    Fout = Fref * NF / NR * Next

when INTFB is low, and by:

    Fout = Fref * NF / NR / OD

when INTFB is high, where Next is the total external feedback division. The INTFB-low equation assumes that OD is in the feedback path (otherwise divide by OD in the equation).

Note that with INTFB low, the total division in the feedback path (NF * OD * Next), including both the internal feedback divider and any output/external feedback division, must be less than or equal to 64 under all operating conditions. This total feedback division limit is necessary in order to prevent the stability of the PLL from being compromised.

To compensate for any output/external feedback division, NB should be set to the total division in the feedback path (NF * OD * Next with INTFB=0 and NF with INTFB=1).

The divided reference clock is the internal clock that follows the reference divider, and has a frequency Fref/NR.

PLL counters:

1. CLKR: A 4-bit bus that selects the values 1-16 for the reference divider (NR)
   NR = CLKR[3:0] + 1
   Example:
     /1     pgm 0000
     /4     pgm 0011
     /8     pgm 0111

2. CLKF: A 6-bit bus that selects the values 1-64 for the multiplication factor (NF)
   NF = CLKF[5:0] + 1
   Example:
     X1     pgm 000000
     X2     pgm 000001
     X64    pgm 111111

3. CLKOD: A 4-bit bus that selects the values 1-16 for the post VCO divider (OD)
   OD = CLKOD[3:0] + 1
   Example:
     /1     pgm 0000
     /4     pgm 0011
     /8     pgm 0111

4. BWADJ: A 6-bit bus that selects the values 1-64 for the bandwidth divider (NB)
   NB = BWADJ[5:0] + 1
   Example:
     /1     pgm 000000
     /4     pgm 000011
     /8     pgm 000111
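The divider encodings and output frequency equations above can be sketched in a few lines of Python. This is an illustrative sketch only, not the TCI calculation script: the function names (encode_divider, output_frequency, default_nb) are hypothetical, and the bit strings are assumed to be written MSB first, as in the "pgm" examples.

```python
def encode_divider(value: int, bits: int) -> str:
    """Pin code for a divide-by-`value` setting (code = value - 1), MSB first."""
    max_value = 1 << bits
    if not 1 <= value <= max_value:
        raise ValueError(f"divider value {value} is outside 1-{max_value}")
    return format(value - 1, f"0{bits}b")


def output_frequency(fref_hz, nr, nf, od, intfb, next_div=1):
    """CLKOUT frequency in locked operation.

    `next_div` stands for Next, the total external feedback division.
    With INTFB low, OD is assumed to be in the feedback path.
    """
    if intfb:
        return fref_hz * nf / nr / od
    if nf * od * next_div > 64:
        raise ValueError("NF * OD * Next must be <= 64 when INTFB is low")
    return fref_hz * nf / nr * next_div


def default_nb(nf, od, intfb, next_div=1):
    """Default bandwidth divider NB: total division in the feedback path."""
    return nf if intfb else nf * od * next_div


# 25MHz reference, internal feedback, NF=40, OD=2 -> 500MHz at CLKOUT
print(output_frequency(25e6, nr=1, nf=40, od=2, intfb=True))
print(encode_divider(40, bits=6), encode_divider(2, bits=4))
```

The sketch does not check the divided reference or /1 output frequency ranges from the specification table; a real settings calculator would need those checks as well.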

Frequency Programming Calculation Script

The program "TCITSMCN40GGPMPLLA1_calc.csh" can be used to calculate PLL frequency settings given input parameters. The script must be run on UNIX systems and have access to the C compiler. The script contains a simple C program that will be automatically compiled upon execution. The script does perform range checking and should return optimal settings, but should be used with care in that its results are only as good as the supplied input.

Examples:

    TCITSMCN40GGPMPLLA1_calc.csh -u
    TCITSMCN40GGPMPLLA1_calc.csh 30e3 500e6
    TCITSMCN40GGPMPLLA1_calc.csh 30e3 500e6 1e-6

PLL Bandwidth Adjustment

The loop bandwidth (BW) of the PLL can be adjusted using BWADJ[5:0]. The bandwidth is given by:

    BW = nom_BW * sqrt(NB_base / NB)

where nom_BW is approximately given by:

    nom_BW = Fref / (NR * 20)

and Fref is the reference clock frequency. The damping factor (D) is approximately given by:

    D = nom_D * sqrt(NB_base / NB)

where nom_D is approximately 1. Because the damping factor changes with bandwidth settings, the bandwidth is practically limited to:

    nom_BW / sqrt(2) < BW < nom_BW * sqrt(2)

in order to limit the damping factor range to 0.7 - 1.4. The -3dB bandwidth (Fbw_3dB) is approximately given by:

    Fbw_3dB = 2.4 * nom_BW * (NB_base / NB)

NB_base is NF * OD * Next for INTFB=0 and NF for INTFB=1. The recommended setting for NB is NB_base, which will yield the nominal bandwidth. Note that nom_BW and nom_D are chosen to result in optimal PLL loop dynamics.
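As a worked illustration of the bandwidth equations above, the following Python sketch evaluates them for a given setting. The function name loop_dynamics and the returned dictionary layout are hypothetical, and nom_D is taken as exactly 1, which the text gives only as an approximation.

```python
import math


def loop_dynamics(fref_hz, nr, nb, nb_base, nom_d=1.0):
    """Evaluate the approximate loop bandwidth equations for one NB setting."""
    nom_bw = fref_hz / (nr * 20)          # nominal loop bandwidth
    ratio = nb_base / nb
    bw = nom_bw * math.sqrt(ratio)        # loop bandwidth
    damping = nom_d * math.sqrt(ratio)    # damping factor
    f3db = 2.4 * nom_bw * ratio           # -3dB bandwidth
    # Practical limit from the text: nom_BW/sqrt(2) < BW < nom_BW*sqrt(2)
    in_range = nom_bw / math.sqrt(2) < bw < nom_bw * math.sqrt(2)
    return {"nom_bw": nom_bw, "bw": bw, "damping": damping,
            "f3db": f3db, "in_practical_range": in_range}


# 100MHz reference, NR=2, recommended setting NB = NB_base = 16
print(loop_dynamics(100e6, nr=2, nb=16, nb_base=16))
```

With NB equal to NB_base, the ratio is 1 and the result is the nominal bandwidth and damping. Settings with NB_base/NB at or beyond 2 or 1/2 are flagged as outside the practical range.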

PLL Bypass Mode

The PLL has a bypass mode (BYPASS=1) where the reference clock (RCLK) is bypassed directly to the PLL output (CLKOUT) with no clock multiplication or deskew operation. BYPASS controls the PLL's output multiplexer, which selects either the reference clock or the clock from the PLL's output divider.

If the PLL is locked and BYPASS is asserted, then the overall power dissipation will be similar to what one would see under functional mode operation. However, if both power-down and bypass are asserted, then the overall power dissipation will be due mainly to a few buffers toggling in the bypass path. The power dissipation in this case will be very small.

PLL Test Mode

The PLL has a divider test mode (TEST=1) to allow for rapid production testing of the dividers in the PLL without using the internal analog circuitry. This mode (TEST=1) overrides the bypass mode (BYPASS=1). The output frequency Fout at CLKOUT is related to the reference frequency Fref by:

    Fout = Fref / NR / NF / OD

The dividers typically implement an N+1 divide given an input of N. They are actually synchronous counters where the output is the carry-out signal. In divider test mode, the dividers are cascaded, with the carry-out signal of one driving the input clock of the next. When dividing by 1, the carry-out signal is fixed high. Thus, the PLL outputs will not toggle in divider test mode if any of the dividers is set to divide by 1 (NR=1, NF=1, or OD=1).

To perform rapid vector testing on the counters, they must be set to a known state so that the PLL output transitions can be tracked and compared against a predefined sequence. The PLL reference and feedback counters can be synchronously reset by asserting RESET in a specified vector sequence.
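The divider test mode relationship above can be checked with a short Python sketch. The function name divider_test_fout is hypothetical, and returning None is simply one way to represent the "outputs do not toggle" case.

```python
def divider_test_fout(fref_hz, nr, nf, od):
    """CLKOUT frequency in divider test mode (TEST=1).

    Returns None when any divider is set to divide by 1, because its
    carry-out is then fixed high and CLKOUT does not toggle.
    """
    if 1 in (nr, nf, od):
        return None
    return fref_hz / nr / nf / od


print(divider_test_fout(64e6, nr=2, nf=4, od=8))   # dividers cascaded: /64
print(divider_test_fout(64e6, nr=1, nf=4, od=8))   # NR=1: no toggling
```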

PLL Behavioral Modeling

For system level verification, a Verilog based behavioral model of the PLL is provided. The Verilog model predicts the behavior of the PLL closely but not perfectly. The model requires a sufficiently fine time-step to model locking behavior correctly. In steady state, the Verilog model does not model any jitter that might be present in the real PLL.

To speed up simulation, during startup, the Verilog model is set up to achieve lock much faster than would be the case with the real PLL. After changes to the clock source frequency and/or runtime divider value, the number of cycles required by the Verilog model for re-lock is also less than the case with the real PLL. The real lock-time specification can be found in the "General-Purpose PLL Specifications" section.

Upon deassertion of PLL reset, the PLL will proceed towards a locked state. During this transition, the PLL may output a few cycles that are higher in frequency than the final target frequency. This frequency overshoot modelled by Verilog may be different from the real overshoot. The real frequency overshoot specification can be found in the "PLL Frequency Overshoot" section below.

In general, the chip operation should not depend on the behavior of the PLL output clock until the PLL is completely locked.

PLL Feedback Path

The feedback path for the PLL must be able to propagate a clock frequency that is higher than the final PLL output frequency. This extra frequency range is required because while the PLL is in the process of locking, it can temporarily overshoot above the final output frequency. This temporary overshoot can cause the output clock period to be as much as 40% smaller than the final output clock period and the output clock high or low time to be as much as 50% smaller than the final output clock high or low time. If the feedback path is unable to propagate all clock edges at these higher frequencies for a particular desired operating frequency, the PLL will be unable to lock.

In general, the PLL feedback path should be able to propagate clock edges at least 5us before the PLL reset signal is deasserted.
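To make the overshoot margins concrete, the Python sketch below converts the 40% period and 50% half-cycle figures into the shortest period and pulse width a feedback path would need to pass for a given target frequency. The function name and the assumption of a nominal 50% duty cycle at the target frequency are illustrative.

```python
def feedback_path_requirements(fout_hz, period_shrink=0.40, half_shrink=0.50):
    """Shortest clock period and high/low time seen during lock acquisition.

    Assumes a nominal 50% duty cycle at the final output frequency.
    Returns (min_period_s, min_pulse_s, peak_frequency_hz).
    """
    period = 1.0 / fout_hz
    min_period = period * (1.0 - period_shrink)
    min_pulse = (period / 2.0) * (1.0 - half_shrink)
    return min_period, min_pulse, 1.0 / min_period


min_period, min_pulse, peak = feedback_path_requirements(500e6)
print(min_period, min_pulse, peak)
```

For a 500MHz target this gives a shortest period of about 1.2ns (roughly 833MHz) and a shortest high or low time of about 0.5ns.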


PLL Feedback Delay

The feedback delay between the PLL output and PLL feedback input must be limited to avoid compromising the stability of the PLL. The maximum feedback delay that can be tolerated has a square root dependence on the reference frequency as listed in the table.

The feedback divider (NF) inside the PLL has a zero effective insertion delay. Thus, no matching division is needed in the reference path. This divider is useful if the clock distribution output must be phase-locked to a multiple of the reference with zero added skew.

See the "Deskew PLL Application Notes" section for more information on issues with the feedback path.

Minimizing Jitter

The amount of period jitter observed will depend on the actual noise level on the PLL supplies and chip substrate and noise frequency content. It will increase roughly linearly with the output period and will be roughly independent of multiplication factor or bandwidth setting.

To minimize the overall output jitter, the PLLs should be operated as close as possible to the maximum frequency before any output division. Thus if some division is necessary for the PLL, it should be performed by the OD divider rather than in a reference divider to maximize the VCO frequency. Since the PLL power dissipation increases with increased VCO frequency, there will be a trade-off between jitter performance and power dissipation.

The overall tracking jitter can be minimized by increasing the divided reference frequency. The overall period jitter can be minimized by using an NF value that is as small as possible. In addition, to minimize the overall output jitter, the analog supplies should connect to separate dedicated pads.

The PLL will work beyond the specified maximum frequencies, but the jitter performance will be degraded.

Timing Budgets

The PLL specifications, along with those from the clock distribution network and the clocked elements, play a key part in chip timing budgets. When calculating the timing budgets, one may need to consider the worst-case static phase offset, duty cycle error, period jitter, and possibly tracking jitter from the PLL, the worst-case skew and jitter from the clock distribution, and the worst-case setup, hold, and clock-to-output times for the clocked elements.

Period jitter is significant for setup time or cycle based path budgets but not for hold time or race path budgets. Clock distribution jitter is significant for setup time budgets but less for hold time budgets, depending on the clock distribution structure. Clock distribution skew is important for both setup time and hold time budgets. Static phase offset along with tracking jitter is significant for the setup and hold time budgets of latches or registers receiving data at the chip interface. Finally, duty cycle error must be considered for latch-based designs where the timing of both clock edges is significant.

PLL Power-up and Reset Operation

In order to guarantee that the PLL will lock properly after startup or a counter change, follow the reset sequence described below. In practice, since the feedback path is enclosed, it will probably always oscillate from PLL power-up with a reasonable reference input frequency (not too high). However, this behavior should not be relied upon.

When RESET is asserted, the PLL goes to a frequency between 10MHz and 100MHz depending on the PLL type. This frequency is not well controlled. Upon deassertion of RESET, the PLL output frequency will slew toward lock.

The figure below is a timing diagram that shows the Power-Up and RESET sequence.

        ---+
PWRDN      |
           +------------------------------------------------------------
           ^
           |
           |
           t1

                  +------------------+
RESET             |                  |
        ----------+                  +----------------------------------
                  ^                  ^                              ^
                  |<----- t_rst ---->|<---------- t_lock ---------->|
                  |                  |                              |
                  t2                 t3                             t4

t1 represents the later of the times when VDDA and VDD have reached their steady state levels or when PWRDN is deasserted. PWRDN can be tied low by default.

The time relationship between t1 and t2 is arbitrary -- t1 can occur before t2 and vice versa. However, t3 must occur at least 5us after both t1 and t2. Normally, t2 will occur after t1, in which case t_rst (t3-t2) must be at least 5us. t2 can occur at time 0.

If the PLL settings are changed while RESET is asserted, the settings should be changed at least 5us before t3 (where RESET is deasserted).

Once RESET is deasserted at t3, wait at least 500 divided reference clock cycles to ensure the PLL has locked (t_lock interval). At t4 the PLL will be locked.
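The timing rules above can be expressed as a small Python sketch that computes the earliest allowed t3 and t4 from t1, t2, and an optional settings-change time. The function names and argument names are hypothetical. The 5us and 500-cycle constants come from the specification table.

```python
RESET_MIN_S = 5e-6      # minimum reset time from the specifications
LOCK_CYCLES = 500       # lock time in divided reference cycles


def lock_wait_seconds(fref_hz, nr):
    """t_lock: 500 cycles of the divided reference clock (Fref / NR)."""
    return LOCK_CYCLES * nr / fref_hz


def earliest_reset_release(t1, t2, t_settings=None):
    """Earliest t3: at least 5us after t1, t2, and any settings change."""
    t3 = max(t1, t2) + RESET_MIN_S
    if t_settings is not None:
        t3 = max(t3, t_settings + RESET_MIN_S)
    return t3


# Supplies stable at 1us, RESET asserted from time 0, 25MHz reference, NR=1
t3 = earliest_reset_release(t1=1e-6, t2=0.0)
t4 = t3 + lock_wait_seconds(25e6, nr=1)
print(t3, t4)
```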

Timing Relationships for PLL Digital Signal Inputs

The following describes the timing relationships for the digital signal inputs to the PLL.

The timing relationship for PLL reset (RESET) is as follows:
- should be asserted on power up
- deassertion of PLL reset should occur at least 5us after any feedback counter values change (CLKF[] or external feedback counter inputs)
- chip reset should be held for 500 divided reference clock (RCLK/NR) cycles to ensure that the PLL is completely locked

The timing relationship for feedback counter values (CLKF[], etc.) is as follows:
- changes should occur at least 5us before PLL reset is deasserted

The timing relationship for PLL power down (PWRDN) is as follows:
- can be asserted any time
- deassertion should be followed by an assertion of PLL reset, etc.

The typical chip startup sequence (non-testing mode), controlled by an external power-on reset signal, is as follows:
- assert chip reset and PLL reset (RESET) based on power-on reset signal assertion
- set feedback counter values if not static
- power-on reset signal deasserts
- if power-on reset signal is not guaranteed to be at least 5us wide after voltages have stabilized, then count 5us based on the reference clock (RCLK)
- deassert PLL reset (RESET)
- count 500 divided reference clock (RCLK/NR) cycles
- deassert chip reset

The following block diagram should implement reset signals for the typical chip startup sequence:

                              power-on reset
                                    |
                              +-----------+
                              |   aset    |
                              |           |
              5us         0 --| D  DFF  Q |-------- PLL RESET
                              |           |
       +------------------+   |    clk    |
       |      count       |   +-----------+
       |                  |         |
RCLK --| clk  COUNTER  co |---------/
       |                  |
       |       arst       |
       +------------------+
                |
          power-on reset

                              power-on reset
                                    |
                              +-----------+
                              |   aset    |
                              |           |
           500*NR+5us     0 --| D  DFF  Q |-------- chip reset
                              |           |
       +------------------+   |    clk    |
       |      count       |   +-----------+
       |                  |         |
RCLK --| clk  COUNTER  co |---------/
       |                  |
       |       arst       |
       +------------------+
                |
          power-on reset

arst - asynchronous reset (for counter)
aset - asynchronous set (for FF)
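The two counter limits in the diagram (5us and 500*NR+5us, both counted in RCLK cycles) can be estimated with the Python sketch below. The function name startup_counts is hypothetical, and rounding the 5us interval up to a whole number of RCLK cycles is an assumption about how such a counter would be sized.

```python
import math


def startup_counts(rclk_hz, nr, reset_s=5e-6, lock_cycles=500):
    """Terminal counts, in RCLK cycles, for the PLL RESET and chip reset counters."""
    # round() removes floating-point fuzz before rounding up to whole cycles
    reset_count = math.ceil(round(reset_s * rclk_hz, 6))
    # 500 divided reference cycles correspond to 500 * NR cycles of RCLK
    chip_count = reset_count + lock_cycles * nr
    return reset_count, chip_count


# 100MHz RCLK with NR=4: 500 cycles for 5us, then 2000 more for lock
print(startup_counts(100e6, nr=4))
```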


PLL Cycle Slip Detection

A cycle slip occurs when the edge misalignment between the rising edges of reference and feedback clocks at the PFD input exceeds one reference (or feedback) clock cycle.

The PLL does not have a classic analog lock detection circuit. Such circuits are weak points of PLL designs because they have double sided constraints. If their threshold is too low, they may never signal lock in a noisy environment. If their threshold is too high, they may signal lock too early.

Instead, TCI provides two lower-level signals (FBSLIP and RFSLIP) which detect cycle slips between the VCO and reference clocks. Cycle slip detection circuits are desirable because they signal an out-of-lock condition immediately, not after some large number of cycles, caused by the low bandwidth of a classic lock detection circuit. These cycle slip outputs can be used to construct a highly effective lock-detect mechanism as described below.

RFSLIP goes active for one or more divided feedback VCO cycles when the phase detector misses a divided reference cycle, i.e. when the VCO is running too fast. FBSLIP goes active for one or more divided reference cycles when the phase detector misses a divided feedback VCO cycle, i.e. when the VCO is running too slow. Neither signal is synchronized to the PLL output clocks.

A "PLL lock" signal is usually used for one of two purposes:

1. Determining when to start the chip. We suggest using the previously described reference cycle count instead.

2. Determining if the PLL has lost lock because of a reference frequency change or some other exceptional condition. In this case, the RFSLIP and FBSLIP signals can provide a low-latency way to detect a loss of lock.

Since neither of these signals is synchronized to a PLL output clock, they must first be sampled by meta-stable-hardened circuitry before being processed by logic. We suggest a meta-stable-hard SR latch followed by two meta-stable-hard D flip-flops, as shown below:

+-------+                            CLK ---+-----------+
|       |   +----+                          v           v
| RFSLIP|-->|    |     +------------+   +-------+   +-------+
|       |   | OR |---->|S           |   |  CLK  |   |  CLK  |
| FBSLIP|-->|    |     |  SR-latch Q|-->|D DFF Q|-->|D DFF Q|--\
|       |   +----+  /->|R   MHARD   |   | MHARD |   | MHARD |  |
|  PLL  |           |  +------------+   +-------+   +-------+  |
+-------+           |                                          |
                    |  +------------------------------------+  |
                    |  |                                    |  |
                    \--|ACK Out-Of-Lock Resolution Logic DET|<-/
                       |                                    |
                       +------------------------------------+

The "Out-Of-Lock Resolution Logic" is responsible for handling the loss of lock. Possible actions taken by this logic would be to reset the PLL, change multiplication ratios, and/or switch to an alternate clock source. Once the circuit acknowledges the condition, it resets the SR latch.

Two of the D flip-flops and the SR-latch must be meta-stable hard as indicated ("MHARD"). More than two cascaded meta-stable hard D flip-flops may be used if necessary. Consult with the cell library usage guidelines for synchronization circuits and metastability.
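The logical behavior of the suggested slip-capture circuit can be illustrated with a small Python model. This is only a functional sketch with hypothetical class and method names. It models the sticky SR latch and the two-flop delay, not metastability or real timing.

```python
class SlipMonitor:
    """Functional model: OR gate -> SR latch -> two D flip-flops -> DET."""

    def __init__(self):
        self.latch = False    # SR latch output
        self.stage1 = False   # first synchronizer flip-flop
        self.stage2 = False   # second synchronizer flip-flop (DET)

    def async_inputs(self, rfslip, fbslip):
        """Asynchronous slip pulses set the latch, which then holds."""
        if rfslip or fbslip:
            self.latch = True

    def clock(self):
        """One rising edge of CLK: shift the latch value through both flops."""
        self.stage2, self.stage1 = self.stage1, self.latch
        return self.stage2

    def acknowledge(self):
        """ACK from the resolution logic resets the SR latch."""
        self.latch = False


mon = SlipMonitor()
mon.async_inputs(rfslip=True, fbslip=False)   # short slip pulse
mon.async_inputs(rfslip=False, fbslip=False)  # pulse ends, latch holds
print(mon.clock(), mon.clock())               # DET asserts on the second edge
```

Because the latch holds the event, a slip pulse shorter than one CLK period is still seen by the resolution logic two clock edges later.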

Cascaded PLL Configurations

In some applications it makes sense to cascade PLLs. Such a configuration might involve PLLs on the same chip or PLLs on different chips. PLLs can be cascaded to provide greater frequency resolution, without sacrificing long-term jitter due to a high multiplication ratio.

For any PLL, due to stability constraints, its maximum bandwidth is limited to a fraction of the divided reference frequency. This PLL internally scales its bandwidth to be a precise fraction of the divided reference frequency, thus achieving a near maximum bandwidth in all modes of operation. The long-term jitter from worst case supply/substrate noise will scale inversely proportional to the PLL bandwidth, as a result of phase error that accumulates before the PLL is able to correct for the noise events. As such, for this PLL, the worst-case long-term jitter will scale with the divided reference period.

To achieve high frequency resolution, there are three possible strategies. A single PLL can be used with large NF and NR divider values to form a high value ratio. This strategy is simple, but it might produce too much long-term jitter for some applications due to the low divided reference frequency. A second strategy is to use a single PLL in a fractional N configuration, provided that the dither jitter is not too large. A third strategy is to use cascaded PLLs, each PLL using small NR and NF divider values to form a low value ratio, but collectively implementing a high value ratio. This configuration has the advantage of avoiding low divided reference frequencies that can increase long-term jitter. The long-term jitter will be dominated by the PLL operating at the lowest bandwidth.

The key issue with cascaded PLL configurations is preventing jitter amplification. Each PLL will slightly amplify jitter that occurs at noise frequencies near the PLL bandwidth. Cascading two or more PLLs operating with the same loop bandwidth can lead to increased jitter amplification. To avoid jitter amplification, the PLLs should operate at different loop bandwidths. Specifically, any PLL in the cascaded chain will attenuate jitter amplification from PLLs that precede it which are operating at higher bandwidths. Ideally, the lowest bandwidth PLL should be last in the chain, but since some net clock multiplication is commonly needed, which necessitates a low bandwidth, the first PLL will typically have the lowest bandwidth. It is desirable to make the bandwidths at least a factor of two or four apart.


Chip Integration Issues - Distributed MUXes Distributed multiplexers can be used to avoid undesired cross talk between multiplexed clock signals. Cross talk can lead to increased jitter in the resultant output clock signal. The idea behind distributed multiplexing is to gate each clock signal at its source and combine the gated clocks at the output destination, each performed with a gain stage for noise isolation. The destination may be shared with one of the sources. By gating the clock signals at their sources, cross talk or feed through at the point where the clock signals are combined is minimized. - - - - - - - - - - - - - - - - - - - - - - - - - - - - Source 1 Location | Output Location

+-----\ | Clock_1 ---| | | NAND |O----------+ Sel_1 ---| | | +-----/ | | +-----\ +---| | - - - - - - - - - - - - - - | | NAND |O--- Clock_Out +---| | +-----\ | | +-----/ Clock_2 ---| | | | NAND |O----------+ Sel_2 ---| | +-----/ | Source 2 Location |

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - Clock Distribution TCI recommends that the reference clock be treated as carefully as any other clock signal on the chip. Ideally, the reference clock pad should be located close to the PLL. When the pad cannot be close, the latency should be minimized using wide shielded wires and large inverting buffers. To minimize threshold modulation of the insertion delay, the slew time at the input of every gate in the transmission path should be less than or equal to that of an inverter driving fanout of 6. Lateral shielding of the clock wires should be employed, and the clock should be routed

to curtail the number of signals crossing over and under it in adjacent metal. For example, consider an RCLK path of 10mm from ESD pad to PLL in a typical 130nm process. We recommend that the path be cut into 7 sections each 1.5mm long. Each section should be routed using 2um wide upper-layer metal and 0.5um VSSD shields on either side spaced by 0.5um, and driven by an 80um/40um inverter. 5x6 via arrays should be used when switching layers. Suppose that each wire segment has a resistance of 32 ohms and a capacitance of 838 fF. The 30%-70% worst-case slew time seen at the end of each wire segment would be about 55 ps, and the worst-case stage delay would be about 75 ps. The insertion delay of the RCLK path is then 7 x 75 ps = 525 ps. If we assume that +/-5% dynamic supply variation will cause +/-5% delay modulation, this insertion delay will give 53 ps of peak-to-peak jitter, which is sufficiently low to drive a 380 MHz reference clock from a crystal reference. This example is intended to illustrate that, while it is possible to transmit a clock across a fair amount of silicon, it requires work that can otherwise be avoided. Please note that an 80um/40um inverter is 4x larger than the maximum inverter size in widely available standard cell libraries. To reduce the local power supply drop from the inverter, it should have a narrow aspect ratio (so that it connects to many metal straps in the power grid), and it should have decoupling capacitance adjacent to the inverter cell.
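The arithmetic in the RCLK example can be reproduced with a short calculation. This is a sketch under stated assumptions: the function name is hypothetical, and the step from peak-to-peak jitter to a maximum reference frequency assumes the 2% P-P reference jitter limit that this guide states for the RCLK pin; the guide does not spell out how the 380 MHz figure was obtained.

```python
def rclk_path_budget(n_stages, stage_delay_ps, supply_variation,
                     max_jitter_fraction):
    """Estimate insertion delay, P-P jitter and highest usable RCLK frequency.

    supply_variation is the +/- fraction (0.05 for +/-5%), assumed to
    modulate delay by the same +/- fraction, so P-P jitter is twice that
    fraction of the insertion delay.
    """
    insertion_delay_ps = n_stages * stage_delay_ps
    jitter_pp_ps = 2.0 * supply_variation * insertion_delay_ps
    # Assumption: jitter must stay below max_jitter_fraction of the period.
    min_period_ps = jitter_pp_ps / max_jitter_fraction
    max_ref_mhz = 1.0e6 / min_period_ps
    return insertion_delay_ps, jitter_pp_ps, max_ref_mhz


# Values from the example: 7 stages, 75 ps each, +/-5% supply variation.
delay_ps, jitter_ps, fmax_mhz = rclk_path_budget(7, 75.0, 0.05, 0.02)
print(f"insertion delay {delay_ps:.0f} ps, jitter {jitter_ps:.1f} ps P-P, "
      f"reference up to about {fmax_mhz:.0f} MHz")
```

With these inputs the sketch gives 525 ps of insertion delay and 52.5 ps of peak-to-peak jitter, and a frequency limit close to the 380 MHz quoted in the example.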


PLL Analog Supplies

The PLL's VDDA and VSSA supplies should be connected to dedicated pads near the PLL supplied by an off-chip filter network. This configuration will result in the specified jitter performance. When multiple PLLs are used, each PLL should have its own VDDA and VSSA pads or bumps which are isolated from those of other PLLs. Also, each PLL should have its own package pins and filter network. However, if necessary, the PLL analog supplies can be shorted in the package when this short leads to a lower inductance and better isolated solution.

The PLL imposes no power sequencing requirements on VDDA and VDD. However, the supply pads used for the VDDA and VDD supplies may impose power sequencing requirements. Please consult the supply pad documentation.

Chip Layout Guidelines

PLL integration layout is primarily concerned with three things: substrate noise, supply resistance, and analog supply noise coupling. The PLL is an analog system, which means that compromising any of these things will compromise the jitter performance of the PLL.

- ESD Protection

The PLL has no built-in ESD protection. The customer must provide standard ESD protection for all PLL power, input, and output chip pins.

- Substrate Noise

The PLL block is ideally located on the edge of the die adjacent to the ESD structures for the PLL's analog supply pads. It should not be placed near groups of output drivers, but rather near input receivers, power supply pads, or test pins where the substrate is likely to have the least noise.

- Supply Resistance

The total resistance from the analog supply bumps or pads to the PLL pins should be 1 ohm or less for each of VSSA and VDDA. All of the PLL's analog supply pins should be connected with the same or greater width metal in multiple metal layers, and should be routed directly to the PLL's analog supply pads. The recommended practice is to stack VDDA and VSSA on adjacent metal layers, and to use more than one metal layer for each of these supplies, so that the total metal width for each supply is 30um. Note that for C4 (flip-chip) packaged chips, the wires between the ESD structures and the bumps add to the resistance of the wires between the PLL and the ESD structure, making it somewhat more challenging to ensure 1 ohm or less for each of VSSA and VDDA.

Total resistance from digital supply bumps or pads to the digital supply pins at the edge of the PLL should be 2 ohms or less for each of VSS and VDD. Additionally, resistance between the PLL's digital supply pins and the source/drains of the clock tree buffers should be less than 2 ohms. This low resistance connection is important so that the output signal switches supply domains inside the PLL where the edge rates are controlled to minimize added jitter. Generally, the PLL's digital supply wires should be connected with at least 5um wide wires in one metal layer.

- Analog Supply Noise Coupling

This PLL is designed to shield sensitive internal nodes from outside influences with guard rings and a mat of VDDA/VSSA metal. However, the analog supplies are themselves sensitive to external noise. In general, the ONLY thing that can shield the analog supplies from aggressor signals is space. Digital VDD/VSS, in particular, are aggressor signals. However, the realities of chip integration will often preclude blank space in many of the particular contexts listed below, and so this section is a guide to making the inevitable compromises.
For wire-bonded chips with pad frames, the pads adjacent to the PLL analog supplies should be either quiescent I/Os (e.g. test pins not used in production) or other voltage supplies. On C4 packaged parts, similar placement constraints for the ESD structures should be observed. The ESD structures for VSSA and VDDA should be isolated from other ESD structures in the same way that ESD structures for I/Os with differing supply voltages are normally isolated, even when the PLL analog supply voltage is the same as the adjacent supply voltage. Usually this isolation is done with a diode cell between the PLL supply ESD structures and the adjacent ESD structures to break the supply rings.

Note that the VSSA and VDDA supplies are directly connected to transistor gates inside the PLL. Some I/O libraries include an "analog power supply" pad that has no ESD break-down path to the main chip supplies, which assumes that the VSSA and VDDA nets do not connect directly to transistor gates. Do not use these ESD pads with a TCI PLL.

- C4 (Flip-Chip) Packaging

[Figure: C4 chip floor plan showing I/O cells, diode breaks, the PLL with its VDDA and VSSA connections, and bumps labeled SPARE, JTAG, VDD, VSSA, VDDA, and VSS.]

Example of C4 chip floor plan with PLL.

On C4 (flip-chip) packaged chips, the PLL usually ends up below several bump positions (see figure). Noise or signals on these bumps will couple vertically into the analog power supply shields over the PLL. In order of desirability, these bumps should be assigned to:

1. Analog VDDA/VSSA
2. Unused or spare
3. Static (unchanging) digital inputs (i.e. configuration or test pins)
4. PLL output clocks (NOT reference clocks)
5. Digital VDD/VSS

It is particularly desirable to place analog VDDA/VSSA over the PLL and close to the ESD structures, since those bumps will couple to the digital power grid if they are placed anywhere else. It is better to have the PLL close to the ESD structures, so that the power routes are short and the analog supply resistance can be low, than to have the PLL far from the ESD structures in order to avoid placement under bumps. Metal on the redistribution layer over the PLL should only be used to reach bumps over the PLL.

If, despite the above suggestions, bumps or redistribution layer routes over the PLL are assigned to changing digital signals, those aggressors must be shielded. A single layer of digital VDD or VSS under the aggressor should be used to shield the PLL and its analog supplies from the effects of rail-to-rail swings. Digital VSS is the preferred shield supply (and analog VDDA and VSSA are to be avoided entirely). This single layer should be on the HIGHEST metal layer possible and only under the aggressor, not over the whole PLL. Multiple layers of shielding are counterproductive since they just bring the noise on the digital supplies closer to the PLL and its analog supplies.

- PLL Placement in I/O Pad Ring Area

Placement of the PLL in the I/O pad ring area is possible, but not a preferred approach due to the increased noise coupling from I/O supplies. If the PLL is placed in the I/O pad ring area, no signal pads or power I/O pads should be placed over the PLL area. I/O filler cells can be placed over the PLL area in order to connect the power and ESD break-down rails and to maintain proper ESD protection. The filler cells will result in power routing over the PLL area. To reduce noise coupling into the PLL these filler cells should use top metal layer for routing the power rails. If feasible, to further reduce noise coupling into the PLL, a core VSS shield should be placed over the PLL in order to isolate it from noise on the power rails routed over it. This shield should be in the highest level metal below that used to connect the power rails. Other than the power rail routing in the filler cells and the VSS shield underneath, only floating metal fill should be placed over the PLL area. To minimize noise coupling and series interconnect resistance, I/O pads adjacent to the PLL should be assigned to the PLL's VDDA/VSSA pins. The PLL's core VDD/VSS power

pins can be connected to the nearest point on the chip's core VDD/VSS power mesh.

- Other Layout Issues

No signals or supplies should ideally be routed over the PLL. Metal fill above the PLL should be left floating, and cut along the PLL boundary to minimize capacitive coupling from outside the block through the metal fill. Typically, the PLL hard macros are supplied with the metal fill for upper layers already in place, so this constraint should not be a problem.

The PLL analog supplies should be routed at least 15um away from other supplies or signals on all metal layers. Specifically, we strongly discourage routing the analog supplies under, over or adjacent to the digital supplies. If the analog supplies must cross digital supplies, it is best to have at least one empty metal layer between them, and they should cross orthogonally. Digital supplies are specifically NOT to be used to shield the analog supplies from signals -- physical distance is the only thing that works.

The PLL has an internal substrate isolation ring, which means that no additional isolation is needed beyond the boundary of the block. No wires of any kind should be routed over the PLL block.

No signals in the RCLK signal path, from a possible pad receiver to the PLL input, should have a slew time greater than that of a fanout of 6 inverter, as measured at the input of the next gate in the chain.

TCI requires the reference clock to have less than 2% P-P long-term jitter at the RCLK pin of the PLL. Crystal oscillators are typically quoted to have timing accuracies of a few parts per million. This timing accuracy relates to the error in the average frequency from the specified value, and should not be confused with jitter. Short-term reference clock jitter, caused by delay modulation due to supply variation or crosstalk coupling, is less of a problem because the PLL will filter out most of the high frequency reference clock jitter components. However, the low frequency components that are below the PLL bandwidth will be passed straight through to the output.

Ideally, the RCLK pad receiver should be located near the PLL so that the RCLK insertion delay (delay from the RCLK pad to the PLL pin) is small. If the RCLK pad is more than 1 mm away from the PLL, care should be taken to minimize the insertion delay and crosstalk to RCLK. Refer to the "Clock Distribution" section above.


Package and Board Guidelines

The PLL supply wires inside the package should not be routed near active I/O signal wires or I/O supply wires, but instead near static input signal wires or core supply wires.

The PLL's two analog supplies should be filtered with two series ferrite beads and two shunt 0.1uF and 0.01uF capacitors. The ferrite on VSS is preferred but optional. Adding the ferrite on VSS converts supply noise to substrate noise as seen by the PLL. The PLLs are designed to be relatively insensitive to supply and substrate noise, so the presence of this ferrite is a second order issue.

VDD -----@@@@@-----+------+------ VDDA
         ferrite   |      |
           0.1uF ----- ----- 0.01uF
                   |      |
VSS -----@@@@@-----+------+------ VSSA
         ferrite

The ferrite beads should be similar to one of the following from Murata:

Part number        R@DC   Z@10MHz   Z@100MHz   Z@1GHz   size
---------------------------------------------------------------
BLM18EG601SN1 *    0.35   200       600                 0603
BLM18PG471SN1      0.20   130       470                 0603
BLM18KG601SN1      0.15   160       600                 0603
BLM18AG601SN1      0.38   180       600                 0603
BLM18AG102SN1      0.50   280       1000                0603
BLM18TG601TN1      0.45   190       600                 0603
BLM15AG601SN1      0.60   200       600                 0402
BLM15AX601SN1 *    0.34   190       600                 0402
BLM15AX102SN1      0.49   250       1000                0402
BLM03AX601SN1      0.85   120       600                 0201

* preferred choice

Similar ferrite beads are also available from Panasonic. The key characteristics to select are:
- DC resistance less than 0.40 ohms
- impedance at 10MHz equal to or greater than 180 ohms
- impedance at 100MHz equal to or greater than 600 ohms

The capacitors should be mounted as close to the package balls as possible.
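The three selection criteria can be applied mechanically to a list of candidate beads. The sketch below is illustrative only: the tuple layout and function name are hypothetical, and the numbers are the table values as read here, assuming the three numeric entries per part are R@DC, Z@10MHz and Z@100MHz in that order. Always confirm values against the manufacturer's datasheet.

```python
# (part number, R@DC in ohms, Z@10MHz in ohms, Z@100MHz in ohms, size)
# Values transcribed from the table in this guide (column order assumed).
BEADS = [
    ("BLM18EG601SN1", 0.35, 200, 600, "0603"),
    ("BLM18PG471SN1", 0.20, 130, 470, "0603"),
    ("BLM18KG601SN1", 0.15, 160, 600, "0603"),
    ("BLM18AG601SN1", 0.38, 180, 600, "0603"),
    ("BLM18AG102SN1", 0.50, 280, 1000, "0603"),
    ("BLM18TG601TN1", 0.45, 190, 600, "0603"),
    ("BLM15AG601SN1", 0.60, 200, 600, "0402"),
    ("BLM15AX601SN1", 0.34, 190, 600, "0402"),
    ("BLM15AX102SN1", 0.49, 250, 1000, "0402"),
    ("BLM03AX601SN1", 0.85, 120, 600, "0201"),
]


def meets_criteria(r_dc, z_10mhz, z_100mhz):
    """Apply the three key characteristics listed in the guide."""
    return r_dc < 0.40 and z_10mhz >= 180 and z_100mhz >= 600


selected = [part for part, r, z10, z100, _size in BEADS
            if meets_criteria(r, z10, z100)]
print(selected)
```

Under this reading of the table, both parts marked as preferred choices pass all three criteria.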


PLL Bench/Performance Testing Procedure

This section describes lab testing procedures for the PLL. These tests would not be done in production testing.

- General testing requirements

1. The input clock RCLK should be controllable from off-chip.

2. At least one output clock should be accessible from off-chip. This output clock should be the highest frequency PLL output used on the chip. However, if there is concern about I/O bandwidth limitations, a lower frequency output or divided version can be used. Having off-chip access to RCLK driven from the chip (path is from off-chip, into chip, to PLL, then out of the chip) is also useful for jitter characterization. To conserve package pins, a multiplexer could be used to select between RCLK (at the input of the PLL) and CLKOUT for off-chip observation. The output used for this multiplexer can also be shared with other test circuits.

3. Need easy control over the pins:
   - RESET
   - PWRDN
   - BYPASS
   Ideally these inputs would directly connect to package pins. They can be shared with other PLLs or other circuits. If it is not convenient to make them directly accessible from package pins, then they can be controlled by configuration register state.

4. Other configuration inputs for the PLL can be controlled by configuration register states. These inputs include:
   - CLKR[0:3]
   - CLKF[0:5]
   - CLKOD[0:3]
   - BWADJ[0:5]
   - INTFB
   - TEST
   Typically

5. IMPORTANT: Any configuration register state used must be controllable without the PLL output clock functioning.


- Optional testing features

1. A multiplexer can be added at the PLL output to select between a true and inverted clock and drive an off-chip observation point for duty-cycle measurements.

2. A multiplexer can be added before the RCLK input of the PLL to select between a true and inverted clock. This multiplexer will facilitate phase-step measurements.

- Normal closed-loop PLL tests

A number of tests can be performed on the PLL to measure its closed-loop performance. The first set of tests focuses on basic operation in a noise-free environment. The second set of tests focuses on noise sensitivity.

1. Basic tests
   a. Make sure PLL locks correctly
      - for production testing, frequency only measurement
      - for lab testing, see if the waveform is locked
   b. Measure maximum frequency range
      - for system lab testing, tap off of clock tree output
      - for production test, tap off of PLL directly
      - force the VCO to rail out by applying a large input frequency
      - could lower supply voltage for PLL if frequency is too high
      - this measurement will make sure PLL has adequate frequency range
   c. Over desired frequency range, measure
      - operate only at DESIRED frequencies for production testing
      1. static phase offset
         - only measure if interested in it
      2. power dissipation (supply current)
      3. duty cycle
         - only measure if interested in it
      4. period jitter and input-to-output jitter (no noise)
         a. period jitter is measured by triggering oscilloscope with PLL output and measuring jitter on next cycle edge of PLL output
         b. input-to-output jitter is measured by triggering oscilloscope with PLL input and measuring jitter on corresponding edge of PLL output
            - may not need to measure (low bandwidth)
      5. lock time

2. Noise tests
   - not for production testing
   - can do minor re-work on system test board
   a. Apply noise to both supply and substrate for (b) and (c) by
      1. removing any bypass capacitors on PLL supplies
      2. inserting 5-10ohm resistors in series with PLL supplies
      3. adding 100ohm resistors between noise source and PLL supplies
      4. for supply noise, only drive 100ohm resistor on VDDA
      5. for substrate noise, drive both 100ohm resistors on VDDA and VSSA
      6. provide a way of measuring VDDA/VSSA waveform to determine actual noise amplitude and edge rates
         - can do a more focused testing strategy in lab after observing part
         - substrate noise is less precise because of unknown noise magnitude
         - can just use an oscilloscope
   b. Pulse noise tests (square wave)
      1. apply a square wave as noise source with supply peak-to-peak amplitudes from 5% to 20% VDDA
      2. sweep noise frequency from 1KHz to PLL reference frequency
      3. determine worst-case peak-to-peak jitter (RMS jitter is meaningless for this test) and the frequency where it occurs
      4. measure for both supply and substrate noise
      5. measure both period jitter and input-to-output jitter
   c. Sine wave noise tests
      1. apply a sine wave as a noise source with supply peak-to-peak amplitude of 10% VDDA
      2. sweep noise frequency from 1KHz to PLL reference frequency
      3. plot the peak-to-peak jitter (RMS jitter is meaningless for this test) over the whole frequency range
      4. measure for both supply and substrate noise
      5. measure both period jitter and input-to-output jitter

Note that the worst-case jitter will be observed at noise frequencies near the PLL loop bandwidth. Also, the measured period and long-term jitter will be higher at lower VCO frequencies and will progressively get smaller as VCO frequency increases. However, TCI's jitter specifications cover the worst-case across the entire specified VCO operating frequency range.

- Special PLL tests

If problems are suspected in the PLL operation, the additional testing features allow internal parameters of the PLL to be measured. Note that given the adaptive nature of the circuits, the VCO and loop filter parameters are not tightly controlled and will vary with process corner and temperature. While the characteristics of each block (VCO, charge pump, etc.) are sensitive to the process and temperature, these sensitivities drop out of the combined transfer characteristics, leading to loop dynamics that are independent of process, voltage, and temperature. However, the maximum operating frequency will be sensitive to process, voltage, and temperature.

1. Closed Loop Tests
   a. Phase step response
      1. need a circuit to invert or step the phase of the input clock -- ideally it would be controlled by a configuration signal
      2. trigger the oscilloscope in one-shot mode with the same phase inversion signal
      3. lock the PLL at the desired operating frequency
      4. invert the phase of the input clock
      5. measure the rising edge crossing locations
      6. plot the rising edge crossing locations
   b. Synchronous phase step measurements
      1. same as above, but also divide the PLL output by a number on the order of 256 and use it to trigger the scope and invert the phase
      2. scope can be in normal trigger mode
   c. Bandwidth measurements
      1. loop bandwidth can be measured from the phase-step responses or from the output spectrum (spectrum analyzer measurement)

2. Testing configuration
   a. It may be desirable to measure the jitter characteristics, phase-step response, or output spectrum using different settings. In general, all settings used in normal chip operation should be tested.
   b. Parameters to vary
      1. counter values:
         - CLKR[0:3]
         - CLKF[0:5]
      2. bandwidth setting:
         - BWADJ[0:5]
      3. operating frequency


PLL Production Testing Guidelines

1. Functional closed loop testing will be employed to test the analog circuitry within the PLL for two extreme input frequencies via NR and NF settings. This functional test establishes both the VCO operation and the frequency/phase-lock operation at both high and low VCO frequency corners. The configuration of the clock multiplication divider should be such as to exercise the worst-case speed digital divider path during the high-speed VCO functional test.

2. With TEST and PWRDN asserted the PLL is also configured into "divider test mode." The analog sections of the PLL are effectively powered-down. The PLL's digital dividers are configured as a chain. The output frequency is

   (CLKOUT freq) = (REF freq) / NR / NF / OD

   By connecting a clock source of known frequency to the REF input and monitoring the frequency of the CLKOUT output, the functionality of the digital circuitry can be determined. The NR, NF, OD, NB divider circuits do not require any initialization. However, the initial state of the dividers is undetermined unless a reset sequence is used.

PLL production tests:

Pin access:
1. RCLK: Pin access.
2. FCLK: Pin access.
3. CLKOUT: Pin access. Divide appropriately to an output pin so frequency to test is below 150MHz.
4. CLKR[0:3]: Register or pin access.
5. CLKF[0:5]: Register or pin access.
6. CLKOD[0:3]: Register or pin access.
7. BWADJ[0:5]: Register or pin access.
8. RESET: Pin access (or some way of guaranteeing RESET timing is met).
9. PWRDN: Register or pin access.
10. INTFB: Register or pin access.
11. BYPASS: Register or pin access.
12. TEST: Register or pin access.
13. RFSLIP: Optional pin access if desired (after dividers, if required).
14. FBSLIP: Optional pin access if desired (after dividers, if required).

Tests:
1. Lock PLL and perform frequency counting:
   a. Maximum spec. frequency with output divider set to keep measurement frequency below 150MHz.
   b. Minimum spec. frequency with output divider set to keep measurement frequency below 150MHz.
   c. Application frequency (use additional dividers to keep measurement frequency below 150MHz, if required).
   Since the user can connect a divider to FCLK, locking will depend on this divider.

2. RESET test: Pulse reset for 5us and measure frequency within spec.

3. Divider test: When TEST=1, the 3 dividers (NR, NF, OD) are connected in series such that RCLK is divided by NR*NF*OD and the output is available at CLKOUT. This divider will be tested using a slow-speed functional test. To reset the counters:

   a. Set CLKR=0001, CLKF=000001, RESET=1, wait 1 RCLK cycle.
   b. Set RESET=0, wait 1 RCLK cycle.
   c. Set RESET=1, wait 1 RCLK cycle.
   d. Set CLKOD=0000, RESET=0, wait 63 RCLK cycles.
   e. Stop clock and load desired values to test dividers. (next RCLK cycle will load all counters and force CLKOUT high)


After resetting the counters to the starting state, test the counters as follows:
   a. Set NR, NF, and OD to desired test values, wait 1 RCLK cycle.
   b. Check that CLKOUT changed from low to high (may already be high after reset).
   c. Wait NR*NF RCLK cycles.
   d. Check that CLKOUT changed from high to low.
   e. Wait NR*NF*(OD-1)-1 RCLK cycles.
   f. Go to step "a" for next test iteration.
   g. Wait 1 RCLK cycle.
   h. Check that CLKOUT changed from low to high.

All input changes should be made when RCLK goes low to avoid setup or hold time issues. The testing time for each set of settings is proportional to the product of the input values for all counters. The total testing time can be minimized by testing one counter at a time, where all but one counter input value is set to two. The counter under test can simply be tested with input values to be used by applications. To obtain more complete coverage, each bit of the counter input value can be independently set in separate tests. This method will ensure that logic attached to each input functions correctly. Because internal state of the counter is periodic in nature and will have redundant states between tests, complete coverage is possible. The total testing time will then be proportional to the sum of the maximum counter values (instead of the square of the product if all input values were independently tested).

4. Scan test of system logic with BYPASS=1.

5. Iddq test of system logic with BYPASS=1 and PWRDN=1.
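For the divider test (test 3), a test program needs the expected CLKOUT frequency and the RCLK wait counts between checks. The following is a minimal sketch of that bookkeeping, assuming nr, nf and od are the divider values themselves rather than the CLKR/CLKF/CLKOD pin encodings; the function and key names are hypothetical.

```python
def divider_test_clkout_hz(ref_hz, nr, nf, od):
    """CLKOUT frequency in divider test mode: REF / NR / NF / OD."""
    return ref_hz / nr / nf / od


def divider_test_waits(nr, nf, od):
    """RCLK cycles to wait in steps a, c and e of one test iteration.

    The step-e formula NR*NF*(OD-1)-1 is used as written in the guide.
    It is negative for OD=1, so this sketch only accepts OD >= 2.
    """
    if od < 2:
        raise ValueError("this sketch only handles OD >= 2")
    return {
        "after_load": 1,                          # step a
        "until_falling_edge": nr * nf,            # step c
        "until_next_load": nr * nf * (od - 1) - 1,  # step e
    }


# Hypothetical test point: NR=2, NF=3, OD=4 with a 24 MHz test clock.
waits = divider_test_waits(2, 3, 4)
cycles_per_iteration = sum(waits.values())
print(waits, cycles_per_iteration)
print(divider_test_clkout_hz(24e6, 2, 3, 4))
```

The three waits add up to NR*NF*OD RCLK cycles, that is, one full CLKOUT period per iteration, which is why the testing time for a given setting grows with the product of the counter values.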

Additional Notes

1. No external loop filter components are required for this PLL.

2. Although an on-chip voltage regulator is not explicitly specified and a good PSRR should be inherent in the design of the delay stage, a regulator option is left open to the designer to facilitate meeting jitter requirements if needed.

3. The MUXes that select clocks external to the PLL for testing and feedback modes should be NAND based and not Transmission Gate MUXes (library MUXes should not be used). The MUX function should be distributed and the control signal run along with the internal "distributed" MUX node to minimize the high frequency cross talk associated with co-located macro inputs and shared VDD/VSS connections. An example of this configuration is presented earlier in the "Distributed MUXes" subsection.

4. Since there will be large numbers of SSO producing output buffers, there will be two types of SSO noise that are addressed with this design. The first is the normal core VDD / VSS bounce that can be stabilized with core capacitance and the second is the reflected negative voltage pulse pumping current into the substrate. The latter SSO term modulates the Vt of the Nch transistors thereby creating a change in delay (jitter).

Addendum to Line Item Definitions of Specifications

- Divided reference frequency range:
  The divided reference frequency (Fint) is defined as:
     Fint = Fref/NR
  where Fref is the input reference clock frequency, and NR is the reference divider value. Fint is the rate at which the PFD updates.

- /1 output frequency range:
  This frequency range can also be interpreted as the allowed VCO frequency range when PLL is locked.

- Output divider values:
  The output divider is outside the PLL feedback loop.

- Bandwidth adjustment div. range:
  These affect both the 3-dB bandwidth (Fbw_3dB) and the damping factor (D) of the PLL. Fbw_3dB should be set to less than Fint/10. D should be set to between 0.7 and 1.4.

- Feedback signal delay (max):
  Exceeding this maximum feedback delay can result in poor dynamic performance of the PLL, which can manifest itself as large jitter or inability to lock.

- Period jitter (P-P) (max):
  Typically, period jitter measurements should be performed over at least 100,000 consecutive output clock periods. Ideally, at least 10*Fout/Fbw_3dB consecutive periods should be measured. The specification assumes worst case on-chip operating conditions. Actual measured data is expected to result in a much smaller value. For the formal definition of period jitter refer to the "Definitions" section.

- Input-to-output jitter (P-P) (max):
  Typically, input-to-output jitter measurements should be performed over at least 100,000 consecutive output clock edges. Ideally, at least 10*Fout/Fbw_3dB consecutive edges should be measured. The specification assumes worst case on-chip operating conditions. Actual measured data is expected to result in a much smaller value. If the reference clock is noise free, input-to-output jitter and long-term/accumulated jitter measurements are identical. For the formal definition of input-to-output jitter refer to the "Definitions" section.

- Power dissipation (nom):
  The total power dissipation scales approximately linearly with VCO frequency and will show approximately 20% P-P variation across PVT.
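The relationships in this addendum reduce to a few lines of arithmetic. The sketch below is only an illustration: the function names and the example frequencies are hypothetical and are not taken from the PLL specification tables.

```python
from math import ceil


def divided_reference_hz(fref_hz, nr):
    """Fint = Fref / NR, the rate at which the PFD updates."""
    return fref_hz / nr


def bandwidth_limit_hz(fint_hz):
    """Fbw_3dB should be set to less than Fint / 10."""
    return fint_hz / 10.0


def jitter_sample_counts(fout_hz, fbw_3db_hz):
    """Return (typical, ideal) minimum counts of consecutive periods or edges.

    typical: the fixed 100,000 figure from the guide.
    ideal:   10 * Fout / Fbw_3dB, rounded up to a whole number.
    """
    return 100_000, ceil(10.0 * fout_hz / fbw_3db_hz)


# Hypothetical operating point: 100 MHz reference, NR=4, 1 GHz output,
# 1 MHz loop bandwidth.
fint = divided_reference_hz(100e6, 4)
print("Fint:", fint, "Hz, bandwidth must be below", bandwidth_limit_hz(fint), "Hz")
print("sample counts (typical, ideal):", jitter_sample_counts(1e9, 1e6))
```

A measurement plan that satisfies both recommendations would use the larger of the two counts.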


Definitions

Output Jitter Definitions

Output clock jitter can be measured in a number of ways. It can be measured relative to absolute time, to another signal, or to the output clock itself. The first measurement of jitter is commonly referred to as absolute jitter or long-term jitter. The second is commonly referred to as tracking jitter or input-to-output jitter when the other signal is the reference signal. If the reference signal is perfectly periodic such that it has no jitter, absolute jitter and tracking jitter for the output clock are equivalent. The third is commonly referred to as period jitter and/or cycle-to-cycle jitter.

Output jitter is typically reported as RMS or peak-to-peak jitter. RMS jitter is interesting only to applications that can tolerate a small number of edges with large time displacements that are well beyond the RMS specification with gracefully degrading results. Such applications

can include video and audio signal generation. Peak-to-peak jitter is interesting to applications that cannot tolerate any edges with time displacements beyond some absolute level. The peak-to-peak jitter specification is typically the only useful specification for jitter related to clock generation since most setup or hold time failures are catastrophic to the operation of a chip.

Mathematical definitions of the various jitter terms are described in the sections below.

- Jitter types:

  Tjit_per_pp  = peak-to-peak period jitter measured over N consecutive edges of output clock.
  Tjit_per_rms = RMS period jitter measured over N consecutive edges of output clock.
  Tjit_cyc_pp  = peak-to-peak cycle-to-cycle jitter measured over N consecutive edges of output clock.
  Tjit_cyc_rms = RMS cycle-to-cycle jitter measured over N consecutive edges of output clock.
  Tjit_lt_pp   = peak-to-peak long-term/absolute jitter from N consecutive measurements carried out at i-th edge of output clock.
  Tjit_lt_rms  = RMS long-term/absolute jitter from N consecutive measurements carried out at i-th edge of output clock.
  Tjit_trk_pp  = peak-to-peak tracking/input-to-output jitter from N consecutive measurements carried out at i-th edge of output clock.
  Tjit_trk_rms = RMS tracking/input-to-output jitter from N consecutive measurements carried out at i-th edge of output clock.

- Intermediate quantities:

  Eout(i)     = edge time of i-th edge of output clock.
  Eref(i)     = edge time of reference clock that corresponds to i-th edge of output clock.
  Tout(i)     = period at i-th edge of output clock.
  Tjit_per(i) = period jitter measured at i-th edge of output clock.
  Tjit_cyc(i) = cycle-to-cycle jitter measured at i-th edge of output clock.
  Tjit_lt(i)  = long-term or absolute jitter measurement at i-th edge of output clock.
  Tjit_trk(i) = tracking or input-to-output jitter measurement at i-th edge of output clock.

- Notation used:

  Sum[X(N)]   = X[1] + X[2] + ... + X[N]
  Sum[X(N)^2] = X[1]^2 + X[2]^2 + ... + X[N]^2
  Avg[X(N)]   = Sum[X(N)]/N
  Avg[X(N)^2] = Sum[X(N)^2]/N
  Std[X(N)]   = sqrt(Avg[X(N)^2] - Avg[X(N)]^2)
  Max[X(N)]   = max(X[1], X[2], ... X[N])
  Min[X(N)]   = min(X[1], X[2], ... X[N])

- Jitter definitions:

  Tout(i)      = Eout(i) - Eout(i-1)

  Tjit_per(i)  = Tout(i) - Avg[Tout(N)]
  Tjit_per_pp  = Max[Tout(N)] - Min[Tout(N)]
  Tjit_per_rms = Std[Tout(N)]

  Tjit_cyc(i)  = Tout(i) - Tout(i-1)
  Tjit_cyc_pp  = Max[Tjit_cyc(N)] - Min[Tjit_cyc(N)]
  Tjit_cyc_rms = Std[Tjit_cyc(N)]

  Tjit_lt(i)   = Eout(i) - i*Avg[Tout(N)]
  Tjit_lt_pp   = Max[Tjit_lt(N)] - Min[Tjit_lt(N)]
  Tjit_lt_rms  = Std[Tjit_lt(N)]

  Tjit_trk(i)  = Eout(i) - Eref(i)
  Tjit_trk_pp  = Max[Tjit_trk(N)] - Min[Tjit_trk(N)]
  Tjit_trk_rms = Std[Tjit_trk(N)]

-------------------------------------------------------------------------------
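The jitter definitions translate directly into code operating on a list of measured edge times. The following is a minimal sketch for post-processing captured edge timestamps; the function names and dictionary keys are hypothetical, edge indices start at 0 rather than 1 (which shifts Tjit_lt by a constant and does not change its peak-to-peak or RMS value), and the reference edges passed as eref are assumed to be already paired with the output edges.

```python
from math import sqrt


def _avg(xs):
    return sum(xs) / len(xs)


def _std(xs):
    """Std[X] = sqrt(Avg[X^2] - Avg[X]^2), clamped at zero against rounding."""
    mean = _avg(xs)
    return sqrt(max(_avg([x * x for x in xs]) - mean * mean, 0.0))


def _pp(xs):
    return max(xs) - min(xs)


def jitter_metrics(eout, eref=None):
    """Compute P-P and RMS jitter terms from output edge times (seconds).

    eout: consecutive output clock edge times, at least 3 entries.
    eref: optional reference edge times corresponding to each output edge.
    """
    tout = [b - a for a, b in zip(eout, eout[1:])]        # Tout(i)
    tcyc = [b - a for a, b in zip(tout, tout[1:])]        # Tjit_cyc(i)
    tavg = _avg(tout)
    tlt = [e - i * tavg for i, e in enumerate(eout)]      # Tjit_lt(i)
    result = {
        "per_pp": _pp(tout), "per_rms": _std(tout),
        "cyc_pp": _pp(tcyc), "cyc_rms": _std(tcyc),
        "lt_pp": _pp(tlt), "lt_rms": _std(tlt),
    }
    if eref is not None:
        ttrk = [o - r for o, r in zip(eout, eref)]        # Tjit_trk(i)
        result["trk_pp"] = _pp(ttrk)
        result["trk_rms"] = _std(ttrk)
    return result


# Made-up edge times with a single late edge, for illustration only.
edges = [0.0, 1.0, 2.1, 3.0, 4.0]
ideal = [0.0, 1.0, 2.0, 3.0, 4.0]
for name, value in jitter_metrics(edges, ideal).items():
    print(f"{name}: {value:.4f}")
```

In this made-up example a single edge displaced by 0.1 produces 0.2 of peak-to-peak period jitter and 0.3 of peak-to-peak cycle-to-cycle jitter, while long-term and tracking jitter both show 0.1, which illustrates why the different measures are specified separately.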

Deskew PLL Application Notes
=============================

1. Introduction

For interfacing signals in and out of the chip, it is often required that these signals be synchronized to a common clock source that resides outside the chip. To comply with these interface standards, each chip needs an internal clock that is phase-aligned with the external clock and sequences the on-chip I/O circuitry. In many cases, the off-chip clock driver cannot drive the large on-chip clock loads directly and the external clock must be buffered before use. These on-chip clock buffers have finite delays, which introduce a phase offset between the external and internal clocks. Unfortunately, it is very difficult to predict this delay prior to chip fabrication due to process variability, and the resulting phase difference between the external and internal clocks is almost arbitrary (Figure 1a).

Phase-locked loops can synchronize these clocks by effectively hiding the delay of the clock buffers. A phase-locked loop (PLL) is a feedback system that aligns the phases of its two input clocks, FCLK and RCLK, as denoted in Figure 1b. A PLL achieves this alignment by adjusting the phase of its output clock, which feeds the clock tree input in this case. Thanks to this feedback operation, the clock at the end of the clock tree (FCLK) is phase-aligned with the external clock (RCLK) and synchrony is maintained. The deskew PLL specified in this document can keep the timing skew between the RCLK and FCLK clocks within +/-2% of the /1 output period.

Figure 1. Deskew PLL Overview

[Diagram (a): RCLK drives the clock tree directly; FCLK is taken at the end of the clock tree, where it clocks the on-chip registers.]

(a) Simple clock buffering: the phase relationship between RCLK and FCLK is unknown and can be arbitrary, as the clock tree delay can vary widely due to process variation.

[Diagram (b): RCLK drives the PLL reference input and CLKOUT drives the clock tree; FCLK, taken at the end of the clock tree, is fed back to the PLL.]

(b) Deskewing using a phase-locked loop: FCLK is phase-aligned with RCLK and the delay of the clock tree is effectively nulled out.

2. Features of the Deskew PLL

In addition to deskewing the phase difference between the external and internal clocks, the deskew PLL has some other features for the user's convenience: the built-in feedback, reference, and output clock dividers.

Sometimes, the internal clock frequency has to be a multiple of the external clock frequency. Phase-locked loops can perform frequency multiplication by inserting a clock divider (denoted as /NF) in the feedback path (FCLK). The clock divider divides the clock frequency down by an integer, and the PLL then phase-aligns this divided clock (DCLK) with the divided external clock (RCLK after the NR divider). As a result, the frequency of the PLL output clock is that integer multiple of the divided external clock frequency. In the deskew PLL, the divide ratio can be set from 1 to 64 by the 6-bit CLKF input (Figure 2).

Users can choose to use external clock dividers if different divide ratios are desired. However, the best performance of this PLL is achieved with the internal clock divider (/NF) because it is specially designed to hide its inherent delay. In Figure 2, the delay through the internal clock divider (/NF) is effectively zero, so the delays from the FCLK and RCLK inputs to the phase-frequency detector (PFD) are identical. For stable operation of the PLL, the feedback path delay must be kept shorter than the specified amount. Careless use of external clock dividers may push the PLL into an unstable state where the timing of the output clocks becomes extremely sensitive to noise. External clock dividers also require careful delay matching.

Figure 2. Additional features of the deskew PLL: built-in reference, feedback, and output dividers.


[Block diagram: FCLK passes through the /NF divider (set by the 6-bit CLKF input) to produce DCLK, and RCLK passes through the /NR divider (set by the 4-bit CLKR input); both divided clocks drive the PFD. The VCO output passes through the /OD divider (set by the 4-bit CLKOD input) to input 0 of the output multiplexer, RCLK feeds input 1, and BYPASS selects which input drives CLKOUT.]
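The divider relationships described in this section can be sketched as a small calculation. This is an illustrative sketch, not the vendor's frequency programming script: the function name deskew_pll_frequencies is hypothetical, it assumes CLKOUT is fed back to FCLK (directly or through a clock tree) with no external frequency division, and it deals only with divide ratios, not with how the CLKF, CLKR, and CLKOD input codes map to those ratios. The ratio limits (NF 1-64, NR and OD 1-16) come from the divider descriptions in this guide; VCO frequency limits are not checked here.

```python
from fractions import Fraction

def deskew_pll_frequencies(f_rclk_hz, nf, nr=1, od=1):
    """Return PFD, CLKOUT, and VCO frequencies in Hz (hypothetical helper).

    Assumes CLKOUT is fed back to FCLK at the same frequency, so the loop
    forces FCLK/NF to equal RCLK/NR.
    """
    if not 1 <= nf <= 64:
        raise ValueError("NF must be between 1 and 64")
    if not 1 <= nr <= 16:
        raise ValueError("NR must be between 1 and 16")
    if not 1 <= od <= 16:
        raise ValueError("OD must be between 1 and 16")

    # Frequency compared at the phase-frequency detector.
    f_pfd = Fraction(f_rclk_hz) / nr
    # In lock, DCLK = FCLK/NF matches the divided reference.
    f_clkout = f_pfd * nf
    # The VCO runs OD times faster than CLKOUT because /OD follows the VCO.
    f_vco = f_clkout * od
    return {"f_pfd": f_pfd, "f_clkout": f_clkout, "f_vco": f_vco}
```

With a 200MHz reference, NF=2 and NR=OD=1, this gives a 400MHz CLKOUT. Under the stated assumption, OD changes only the VCO frequency, not the output frequency, because the output divider sits inside the feedback loop.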

3. Dynamic Behavior of the PLL

Clock jitter is the dynamic variation of the clock timing, and it is of primary concern since uncertainty in timing can limit the maximum operating frequency of a system. Jitter is contributed both by the VCO itself and by the application-dependent circuits on the output/feedback path (e.g. clock distribution trees). It is therefore important for designers to understand the PLL dynamic behavior in order to minimize the overall clock jitter and to estimate a reasonable amount for their timing budgets. For the lowest possible jitter, the feedback path must be free from noise and its delay should be kept short.

A phase-locked loop (PLL) constantly examines the phase difference between the reference clock and the feedback clock and adjusts the frequency of the voltage-controlled oscillator (VCO) accordingly. For example, when the reference phase drifts, the PLL moves the output phase to track the change. Similarly, when noise causes the output phase to move away from the reference phase, the PLL tries to recover the locked state as quickly as possible.

However, there is a limit on how quickly the PLL can adjust its output phase. This limit is called the "bandwidth" and determines how quickly the PLL can track changes in the reference phase or, equivalently, how well the PLL can reject undesired disturbances of the output phase. High-bandwidth PLLs are good at reducing jitter on the output clock path by tracking the reference clock, and low-bandwidth PLLs are good at filtering jitter on the reference clock. Deskew PLLs have high bandwidths and can therefore suppress a disturbance on the output clock path if that disturbance varies more slowly than the bandwidth.

In deskew PLL applications, the dominant disturbances are likely to come from the clock distribution trees, where the clock signal travels a long distance under the influence of many hostile noise sources. As a result, the clock at the end of the distribution tree will have worse jitter than the direct output clock of the PLL. If the phase drift caused by this added jitter is slow enough, the error can be effectively canceled by the PLL. If the drift is faster than the bandwidth, however, the PLL will not be able to respond quickly enough to correct the error. Unfortunately, the jitter added by the clock tree typically has a drift rate that is much higher than the bandwidth, which prevents the PLL from correcting it.

Therefore, when designers budget the timing uncertainties on their clocks after the distribution chains, they must add the expected jitter from the clock distribution paths to the specified PLL jitter. This added jitter is likely to increase with the delay of the clock distribution path.
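The bandwidth argument above can be illustrated with a rough numerical sketch. It assumes a simple first-order loop model, in which a phase disturbance injected inside the loop (for example by the clock tree) reaches the feedback clock through the high-pass response s/(s + 2*pi*f_bw). The real PLL loop is of higher order, so the numbers are indicative only, and the names residual_fraction and budget_jitter are hypothetical.

```python
import math

def residual_fraction(f_noise_hz, f_bw_hz):
    """Fraction of an in-loop phase disturbance that remains uncorrected.

    Assumption: first-order loop with -3dB bandwidth f_bw_hz, so the
    magnitude of s/(s + 2*pi*f_bw) at f_noise_hz is f/sqrt(f^2 + f_bw^2).
    """
    return f_noise_hz / math.hypot(f_noise_hz, f_bw_hz)

def budget_jitter(pll_jitter, tree_jitter):
    """Total jitter budget when the tree jitter is too fast for the PLL.

    Follows the guidance in the text: the expected clock tree jitter is
    added to the specified PLL jitter (same unit for both arguments).
    """
    return pll_jitter + tree_jitter
```

In this model a disturbance at one thousandth of the bandwidth is reduced to about 0.1% of its original size, while a disturbance at a thousand times the bandwidth passes almost unchanged, which is why fast clock tree jitter has to be budgeted in full.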


4. Application Examples

4.1 Example 1: Basic Configuration

Figure 3 shows the basic application of this deskew PLL. Suppose the external clock (RCLK) frequency is 200MHz and the on-chip circuits need a 400MHz clock synchronized to RCLK. To accomplish this, the buffered CLKOUT (denoted as CLKA) is fed back as FCLK and the internal clock divider of the PLL (/NF) is configured to divide by 2.

Figure 3. Basic configuration of the deskew PLL.

[Diagram: RCLK (200MHz) drives the PLL reference input; the PLL (NF=2, NR=OD=1) drives CLKOUT into clock tree A, whose output CLKA (400MHz) clocks the on-chip registers and is fed back to FCLK.]

The static timing offset between the rising edges of FCLK and RCLK seen at the PLL input is kept to less than 2% of the reference clock period. However, if there is some delay on the feedback path between CLKA and FCLK, the phase difference between CLKA and RCLK will be offset by this amount. Likewise, if the RCLK route on the chip is long and has noticeable delay, this will introduce an undesirable offset between the internal clocks and the external clock. Therefore, it is best to minimize, or at least equalize, the propagation delays on the RCLK and FCLK paths.

The net polarity of the clock buffer tree does not affect the output phase, as long as the buffer tree is within the PLL feedback loop. The 180-degree phase shift caused by a net inversion is compensated by the PLL adjusting its output phase. In this example, CLKA will always be rising-edge aligned with RCLK, regardless of the net polarity of the clock tree.
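The effect of unequal RCLK and FCLK path delays can be sketched numerically. This is a simplified illustration with hypothetical names: it assumes the PLL aligns the edges seen at its own RCLK and FCLK pins, treats the 2% static offset limit quoted above as a symmetric bound, and ignores jitter. All times use the same unit.

```python
def clka_offset_from_external_clock(rclk_route_delay, feedback_delay,
                                    t_ref, static_offset_fraction=0.02):
    """Estimate the CLKA timing offset relative to the external clock.

    rclk_route_delay: delay from the external clock to the PLL RCLK pin.
    feedback_delay:   delay from CLKA to the PLL FCLK pin.
    t_ref:            reference clock period.
    Returns (minimum, nominal, maximum); positive means CLKA lags.
    """
    # The PLL aligns its pins: t_clka + feedback_delay = t_ext + rclk_route_delay.
    nominal = rclk_route_delay - feedback_delay
    # Assumed symmetric bound on the static offset at the PLL inputs.
    tolerance = static_offset_fraction * t_ref
    return nominal - tolerance, nominal, nominal + tolerance
```

For a 200MHz reference (5ns period) with a 0.3ns RCLK route and a 0.1ns feedback route, the nominal offset is 0.2ns. Equalizing the two delays brings the nominal offset back to zero, which is the reason for the layout advice above.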


4.2 Example 2: The External Clock Divider in the Feedback Path

For cases where an external clock divider is necessary, the PLL can be configured as in Figure 4. Since most clock dividers have delays and introduce phase offsets between their input and output clocks, this delay must be effectively canceled by inserting a delay-matched dummy divider in the RCLK path. Otherwise, the delay will cause a static phase offset between FCLK and RCLK, since the PLL aligns the clock phases seen at its inputs, A and B, as denoted in Figure 4. The reasons are similar to the case of having delay on the feedback path, discussed in Section 4.1.

Figure 4. The use of external clock dividers in the feedback path.

[Diagram: CLKOUT drives the clock tree; the clock tree output is fed back as FCLK through an external /N divider to PLL input A; RCLK passes through a /1 dummy divider to PLL input B. The /N and /1 clock dividers need to be delay-matched.]

4.3 Example 3: The External Clock Divider in the Forward Path

For cases where multiple clock frequencies are needed but some of them present only small loads, it may be power- and area-consuming to have identical clock buffer trees on all paths. One may wish to buffer one clock first and derive other frequencies later. External clock dividers may be used in this case as depicted in Figure 5. It is necessary to match these clock dividers in order to keep the resulting clocks aligned.

Figure 5. The use of external clock dividers in the forward path.

[Diagram: RCLK drives the PLL reference input and CLKOUT drives the clock tree. The clock tree output drives three external dividers: /1 producing CLKA, /2 producing CLKB, and /4 producing CLKC. CLKA, the /1 divider output, is fed back to FCLK. These clock dividers need to be delay-matched.]


4.4 Example 4: Chip-to-chip Communication

For synchronous communication across the chip boundary, the communicating chips must be synchronized in some way to meet the setup and hold time requirements. One possible way is to synchronize all chips to a common global clock, so that the collection of chips behaves as if it were in a single clock domain. However, this approach can severely limit the frequency at which the chips communicate with each other, for the following reasons.

First, to minimize the skew between the clocks of different chips, the propagation delays of the board traces from the clock source to the chips must be precisely matched. The deskew PLL on each chip can synchronize only to the clock seen at its own input point, and any difference between the trace delays will cause the clocks of different chips to have skew between them. The clock skew narrows the valid timing window for meeting setup and hold requirements and eventually limits the maximum frequency of operation.

Second, the latency over the channel between the communicating chips also limits the signalling rate. This latency may be even longer than the one-way time-of-flight if the channel is poorly terminated or has discontinuities (e.g. stubs on the bus). If timing must close in one cycle, the operating frequency is limited by the channel latency. Although the signal may be allowed to travel for multiple clock cycles, a precise estimate of the channel latency is required to ensure reliability. Tight control over the circuit board traces usually incurs high costs.

For high-frequency chip-to-chip communication, it is more viable to dedicate separate timing information to each channel. For example, when chip A wants to send data to chip B, chip A also transmits a signal that embeds the timing information of the data. This signal is often referred to as a "source-synchronous clock" or "data strobe". Chip B then uses the timing information embedded in this signal to sample the data at its optimal timing. The cost of the additional channel can be amortized, especially when the data channel is wide (e.g. 8-bit).

When this timing signal is periodic (a "clock"), the deskew PLL can be used to recover the timing information of the received data, as illustrated in Figure 6. The deskew PLL locks to the source-synchronous clock sent by the transmitter chip and generates a clock that is phase-aligned with it. The receiving sampler triggered by the deskew PLL clock then samples the data at the center of the data eye.

Figure 6. Deskew PLLs in the high-speed chip-to-chip interface applications.

[Diagram: chip A sends CLOCK and the data signals DATA_A, DATA_B, and DATA_C to chip B, with each data edge aligned with the clock's falling edge. In chip B, the deskew PLL (/1) locks to CLOCK and drives a clock tree whose output is fed back to the PLL and clocks the receiving samplers. The samplers capture the data at the clock's rising edge, which is the optimal sampling point.]
