
Chapter 2 Virtex Architecture

Basic Virtex Architecture 2-1

Objectives
After completing this module, you will be able to:
• Recognize the basic architectural resources of the Virtex FPGA


Outline
• CLB Resources
• Select I/O
• Memory Resources
• Global Routing Resources
• Summary


The FPGA Die


[Figure: the FPGA die, showing I/O Blocks (IOBs), Configurable Logic Blocks (CLBs), switch matrices, and programmable interconnect]


The CLB Tile


• The CLB tile is composed of:
  - Switch matrix
  - Configurable Logic Block and associated general routing resources
  - IMUX and OMUX
• All CLB inputs have access to interconnect on all 4 sides
• Fast local feedback within the CLB and direct connects to east and west CLBs enable the creation of wide functions of up to 19 inputs within a single CLB

[Figure: CLB tile, showing the switch matrix, two slices with local feedback, direct connects, carry chains, tristate busses, and single, hex, and long lines]

Simplified CLB Structure


[Figure: simplified CLB with Slice 0 and Slice 1, each containing LUTs, carry logic, and registers with D, CE, PRE, and CLR inputs and a Q output]

• Two slices in each CLB
• Two BUFTs associated with each CLB, accessible by all 8 CLB outputs
• Carry logic runs vertically, up only


Look-Up Tables
• Combinatorial logic is stored in Look-Up Tables (LUTs) in a CLB
• Capacity is limited by number of inputs, not complexity
• Delay through CLB is constant

[Figure: combinatorial logic with inputs A, B, C, D and output Z, stored as the truth table below]

A B C D | Z
--------+--
0 0 0 0 | 0
0 0 0 1 | 0
0 0 1 0 | 0
0 0 1 1 | 1
0 1 0 0 | 1
0 1 0 1 | 1
. . .
1 1 0 0 | 0
1 1 0 1 | 0
1 1 1 0 | 0
1 1 1 1 | 1
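The idea that a LUT's capacity depends on input count rather than logic complexity can be illustrated with a small behavioral model. This is only a sketch in Python, not vendor code: the helper names `make_lut4` and `lut4_read` and the function `example` are hypothetical, and the example function is arbitrary rather than the one in the slide's truth table.

```python
from itertools import product


def make_lut4(func):
    """Precompute the 16-entry table for any Boolean function of four inputs."""
    return [func(a, b, c, d) & 1 for a, b, c, d in product((0, 1), repeat=4)]


def lut4_read(table, a, b, c, d):
    """Evaluate the LUT: the inputs form an address, the output is one stored bit."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]


# Hypothetical example function, chosen only for illustration.
def example(a, b, c, d):
    return (a & b) ^ (c | d)


table = make_lut4(example)
```

However complicated `example` is, the table always has 16 entries and a read is always a single indexed lookup, which is why the delay does not depend on the function stored.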


Connecting Function Generators


• Some functions need several function generators
  - F5 MUXs connect pairs of function generators
    - Any function of 5 inputs or some up to 9 inputs
  - F6 MUXs connect all 4 function generators
    - Any function of 6 inputs or some up to 19 inputs

[Figure: Slice 0 and Slice 1, each with two function generators joined by an F5 MUX; an F6 MUX combines the two slices]
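One way to see why an F5 MUX lets two 4-input function generators implement any 5-input function is to split the function on its fifth input and let that input drive the mux select. The Python sketch below models this under that assumption; all names (`make_f5`, `f5_read`, `parity5`) are hypothetical and the model ignores timing and routing.

```python
from itertools import product


def make_lut4(func):
    """Precompute the 16-entry table for a 4-input Boolean function."""
    return [func(a, b, c, d) & 1 for a, b, c, d in product((0, 1), repeat=4)]


def lut4_read(table, a, b, c, d):
    """Look up one stored bit using the four inputs as the address."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]


def make_f5(func5):
    """Split a 5-input function into two 4-input tables, one per value of e."""
    lut_e0 = make_lut4(lambda a, b, c, d: func5(a, b, c, d, 0))
    lut_e1 = make_lut4(lambda a, b, c, d: func5(a, b, c, d, 1))
    return lut_e0, lut_e1


def f5_read(luts, a, b, c, d, e):
    """The F5 mux picks one LUT output using the fifth input as its select."""
    lut_e0, lut_e1 = luts
    return lut4_read(lut_e1 if e else lut_e0, a, b, c, d)


def parity5(a, b, c, d, e):  # hypothetical example function
    return a ^ b ^ c ^ d ^ e


luts = make_f5(parity5)
```

If the two function generators are instead given different inputs, the pair sees 4 + 4 inputs plus the mux select, which is where the "some up to 9 inputs" figure comes from.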


Flexible Sequential Elements


• Can be flip-flops or latches
• Two in each slice, four in each CLB
• Inputs come from LUTs or an independent CLB input
• Separate set and reset controls
  - Can be synchronous or asynchronous
• All controls are shared within a slice but can be inverted locally

[Figure: register primitives FDRSE (D, CE, R, S, Q), FDCPE (D, PRE, CE, CLR, Q), and LDCPE (D, PRE, CE, G, CLR, Q)]


Shift Register LUT


• Dynamically addressable shift register (SRL)
  - Ultra-efficient programmable delay for balancing pipelined designs
  - Maximum delay of 16 clock cycles
  - Can be read asynchronously by toggling address lines

[Figure: a LUT in a CLB slice used as a chain of registers with IN, CE, and CLK inputs; DEPTH[3:0] selects which stage drives OUT]
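The behavior of the addressable shift register can be sketched as a 16-entry list with a selectable tap. This is a behavioral Python model only; the class name `SRL16` and its method names are chosen for illustration and do not describe the internal circuit.

```python
class SRL16:
    """Behavioral model of a 16-stage shift register with an addressable tap."""

    def __init__(self):
        self.stages = [0] * 16  # stages[0] holds the most recently shifted bit

    def clock(self, d, ce=1):
        """One rising clock edge: shift in d when the clock enable is high."""
        if ce:
            self.stages = [d & 1] + self.stages[:-1]

    def read(self, depth):
        """Asynchronous read: depth (0..15) selects which stage drives OUT."""
        return self.stages[depth & 0xF]


srl = SRL16()
for bit in [1, 0, 0, 0, 0]:
    srl.clock(bit)
```

After these five clock edges the initial 1 sits in stage 4, so `srl.read(4)` returns 1 while the shallower taps return 0. Changing `depth` changes the delay without any further clocking, which is the "dynamically addressable" property.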


Fast Carry Logic


• Simple, fast & complete arithmetic logic
  - Dedicated XOR gate for single-level sum completion
  - Uses dedicated routing resources with a delay of ~0.1 ns
  - All synthesis tools can infer carry logic

[Figure: carry logic stage with Carry Out and Sum outputs]
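A rough way to picture how a carry chain completes a sum in a single logic level is a ripple adder in which each bit uses a LUT output, a carry mux, and the dedicated XOR gate. The Python sketch below assumes the common arrangement where the LUT computes the propagate term (A XOR B); the function name and the 4-bit default width are illustrative choices.

```python
def add_with_carry_chain(a, b, width=4):
    """Ripple addition built from per-bit LUT, carry mux, and XOR stages."""
    carry = 0
    total = 0
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        propagate = a_i ^ b_i                # assumed to be computed in the LUT
        sum_i = propagate ^ carry            # dedicated XOR gate forms the sum bit
        carry = carry if propagate else a_i  # carry mux: pass carry-in or generate
        total |= sum_i << i
    return total, carry


total, carry_out = add_with_carry_chain(9, 8)
```

Here `total` holds the low four bits of the sum and `carry_out` the bit that would continue up the chain to the next slice.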


Dedicated Multiplier Resource


[Figure: multiplier logic cell with LUTs, CY_MUX (S, DI, CI, CO), CY_XOR, and MULT_AND forming the A x B product]

• Highly efficient Multiply & Add implementation
  - Virtex enables a 30% reduction in area by producing the Multiply and Add in the same logic cell


Simplified IOB Structure


• Fast I/O drivers
• Separate registers for input, output & three-state control
  - Asynchronous or synchronous set or reset
  - Common clock and separate clock enables improve usability
  - Configure as FF or latch
• Programmable slew rate and variable input delay
• Selectable I/O standard support

[Figure: IOB with three DFF/LATCH registers (D, CE, S/R, Q) connected to the PAD]


Select I/O
• Select I/O allows direct connection to external signals of varied voltages & thresholds
  - Processors, memory, bus-specific standards, mixed signal...
  - Provides industry-standard IEEE/JEDEC I/O standards
  - Optimizes the speed/noise tradeoff
  - Saves having to place interface components onto your board
• Define I/O by assigning the standard in the Constraints Editor


Currently Supported I/O Standards


• LVTTL
• LVCMOS2
• PCI
• GTL
• GTL+
• HSTL (class 1, 3, and 4)
• SSTL3 (class 1 and 2)
• SSTL2 (class 1 and 2)
• CTT
• AGP
• Note that LVTTL, LVCMOS2, GTL and PCI are the only standards that are 5.0V input tolerant


Data Storage Hierarchy


• Virtex supports 3 levels of memory hierarchy
• On-chip Distributed SelectRAM+
  - Small-to-medium memories
  - 0.6-ns read access time
• On-chip Block SelectRAM+
  - Larger memories
  - True dual-ported operation
  - 3.3-ns read access time
• Fast SelectI/O interfaces to external RAM
  - Fast clock-to-output times enable DLL to boost memory bandwidth


Distributed SelectRAM+
• Same RAM/ROM as XC4000
  - Synchronous write
  - Asynchronous read
• Accompanying flip-flops can be used to create a synchronous read
• RAM/ROM are initialized during configuration
• Data can be written to RAM after configuration
• Emulated dual-port RAM can read and write to different addresses simultaneously

[Figure: slice LUTs used as RAM primitives: RAM16X1S (D, WE, WCLK, A0-A3, O), RAM32X1S (D, WE, WCLK, A0-A4, O), and RAM16X1D (D, WE, WCLK, A0-A3, SPO, DPRA0-DPRA3, DPO)]
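The combination of synchronous write, asynchronous read, and a second read-only address can be sketched with a behavioral model of the RAM16X1D primitive. The port names follow the figure labels; the Python class, its method names, and the `init` argument layout (bit n for address n) are assumptions made for illustration.

```python
class RAM16X1D:
    """Behavioral model: 16x1 RAM, one write/read port plus one read-only port."""

    def __init__(self, init=0):
        # Assumed layout: init is a 16-bit value loaded at configuration,
        # with bit n holding the initial content of address n.
        self.mem = [(init >> n) & 1 for n in range(16)]

    def clock(self, d, we, a):
        """Rising edge of WCLK: write d at address a when WE is high."""
        if we:
            self.mem[a & 0xF] = d & 1

    def spo(self, a):
        """Asynchronous read at the write port's address (SPO)."""
        return self.mem[a & 0xF]

    def dpo(self, dpra):
        """Asynchronous read at the independent read address (DPO)."""
        return self.mem[dpra & 0xF]


ram = RAM16X1D(init=0x0001)  # address 0 starts at 1, all others at 0
ram.clock(d=1, we=1, a=5)    # write address 5 ...
value = ram.dpo(0)           # ... while reading address 0 on the other port
```

Registering the value returned by `spo` or `dpo` on the next clock edge would model the synchronous read that the accompanying flip-flops provide.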

Block SelectRAM+
• Up to 32 dual-ported 4096-bit RAM blocks
  - Synchronous read and write
  - Located on left & right sides with one block every 4 rows
  - 8 blocks in the XCV50 (32 Kb)
  - 32 blocks in the XCV1000 (128 Kb)
• True dual-port memory
  - Each port has synchronous read and write capability
  - Different clocks for each port
• Synchronous Reset & INIT Values
  - State machines, decodes, serial to parallel conversion, etc.

[Figure: RAMB4_S#_S# primitive with port A (WEA, ENA, RSTA, CLKA, ADDRA[#:0], DIA[#:0], DOA[#:0]) and port B (WEB, ENB, RSTB, CLKB, ADDRB[#:0], DIB[#:0], DOB[#:0])]

Allowed Widths

Width (#)  Depth  ADDR    DATA
1          4096   (11:0)  (0:0)
2          2048   (10:0)  (1:0)
4          1024   (9:0)   (3:0)
8          512    (8:0)   (7:0)
16         256    (7:0)   (15:0)
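The allowed-widths table follows directly from the 4096-bit block size: depth is 4096 divided by the port width, and the address bus needs log2(depth) bits. A short Python sketch of that arithmetic is shown here; the function names are hypothetical.

```python
BLOCK_BITS = 4096  # bits per block SelectRAM+ block, from the slide


def port_geometry(width):
    """Return (depth, address MSB, data MSB) for one port of the given width."""
    if width not in (1, 2, 4, 8, 16):
        raise ValueError("unsupported port width")
    depth = BLOCK_BITS // width
    addr_bits = depth.bit_length() - 1  # log2 of a power-of-two depth
    return depth, addr_bits - 1, width - 1


def total_kbits(blocks):
    """Total block RAM capacity in Kb for a device with this many blocks."""
    return blocks * BLOCK_BITS // 1024


geometries = {w: port_geometry(w) for w in (1, 2, 4, 8, 16)}
```

For example, `geometries[8]` is `(512, 8, 7)`, matching the ADDR (8:0) and DATA (7:0) row, and `total_kbits(8)` and `total_kbits(32)` give the 32 Kb and 128 Kb figures quoted for the XCV50 and XCV1000.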


Global Clock Routing Resources


• Four dedicated global low-skew buffers
  - Dedicated input pin, intended to distribute clocks only
  - Two on the middle-top, two on the middle-bottom
• 24 additional shared resources
  - These resources are associated with the IOB and are used by default
  - These resources are intended to distribute low-skew, high-fanout control signals such as additional clocks, clock enables, three-state controls, and resets
  - Distribute control signals across the device in less than 10 ns


DLLs Maximize I/O Speed


• Clock distribution delay leaves the outputs less time to go off chip
• Traditional solution
  - Use highly buffered, balanced clock trees
    - Needed to reduce internal clock skew
    - Cannot totally eliminate the delay
• The Virtex solution
  - Use a Delay-Locked Loop (DLL)
  - Phase shifts the internal and external clocks
  - Effectively eliminates the clock-distribution delay
  - Shortens signal output time


Virtex has 4 Independent DLLs


[Figure: DLL block diagram with Clock and Fdbk Clock inputs, an Error Comparator, a Delay element, and the clocked CLB, Data, and IOB path]

• DLLs adjust clock delay to align internal and external clocks
  - Digital closed-loop control
  - 25 to 200 MHz range, 35-picosecond resolution

Clock Delay Removal Example


• The most common use is for clock delay removal. This circuit removes all delay from the clock pin to the clock ports of the synchronous elements
• This circuit compares CLKIN with CLKFB, and phase shifts the output clock so the signal arrives at all clock ports at the same time

[Figure: BUFGDLL, built from IBUFG, CLKDLL (CLKIN, CLKFB, RST, CLK0, CLK90, CLK180, CLK270, CLK2X, CLKDV, LOCKED), and BUFG]
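The effect of comparing CLKIN with CLKFB can be pictured as adding delay in small steps until the inserted delay plus the distribution delay equals a whole clock period, so the fed-back edge lines up with the next input edge. The Python sketch below is a simplified behavioral illustration only, not the actual control circuit; the function name and the example numbers (100 MHz clock, 3.2 ns distribution delay) are assumptions, and only the 35 ps step comes from the slides.

```python
def lock_dll(period_ps, distribution_delay_ps, step_ps=35):
    """Step a delay line until the fed-back clock edge lines up with CLKIN.

    Returns (taps, residual_error_ps). Assumes step_ps is smaller than the
    clock period; all argument values used here are illustrative.
    """
    taps = 0
    while True:
        total = taps * step_ps + distribution_delay_ps
        error = total % period_ps  # how far the feedback edge lags a CLKIN edge
        # Locked when the feedback edge lands within one step after a CLKIN edge
        if total >= period_ps and error < step_ps:
            return taps, error
        taps += 1


taps, error_ps = lock_dll(period_ps=10_000, distribution_delay_ps=3_200)
```

With these example numbers the loop settles at 195 taps (6,825 ps of inserted delay), leaving a 25 ps residual error, which is below the 35 ps step size.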

DLL Division
• Selectable Division Values
  - 1.5, 2, 2.5, 3, 4, 5, 8, or 16
  - Input 50/50 Duty Cycle Correction Available
  - Use DLL Pair to Combine Functions

[Figure: 30 MHz, 180 Phase Shift, Clock Multiply & Clock Divide. A pair of DLLs takes a 30 MHz input, with 30 MHz used for feedback, and produces 30 MHz (180 shift), 15 MHz (divide by 2), and 60 MHz (multiply by 2)]
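The frequencies in the figure follow from simple arithmetic on the input clock. A minimal Python sketch is given here; the function name is hypothetical, and the dictionary keys reuse the CLK2X and CLKDV output names from the CLKDLL symbol.

```python
DIVIDE_VALUES = (1.5, 2, 2.5, 3, 4, 5, 8, 16)  # selectable values from the slide


def dll_outputs(clkin_mhz, divide=2):
    """Frequencies on the doubled and divided outputs for a given input clock."""
    if divide not in DIVIDE_VALUES:
        raise ValueError("unsupported division value")
    return {"CLK2X": clkin_mhz * 2, "CLKDV": clkin_mhz / divide}


example = dll_outputs(30, divide=2)
```

For the 30 MHz input in the figure this gives 60 MHz on CLK2X and 15 MHz on CLKDV.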

The Virtex Family


Device    System Gates  Logic Cells  Block RAM
XCV50     57,906        1,758        32 Kb
XCV100    108,904       2,700        40 Kb
XCV150    164,674       3,888        48 Kb
XCV200    236,666       5,292        56 Kb
XCV300    322,970       6,912        64 Kb
XCV400    468,252       10,800       80 Kb
XCV600    661,111       15,552       96 Kb
XCV800    888,439       21,168       112 Kb
XCV1000   1,124,022     27,648       128 Kb

User I/O by package (CS144, TQ144, PQ/HQ240, BG256, BG352, BG432, BG560, FG256, FG456, FG600, FG680):
94 94 164 180 94 94 164 180 260 260 260 316 316 404 316 404 316 404 404 164 164 164 164 164 164
176
176
176 260
176 284
312 404 404 500 404 514 514

The complete Virtex Data Sheet is on your AppLinx CD-ROM and at www.xilinx.com/partinfo/virtex.pdf

Summary
• FPGAs are made primarily of LUTs and registers contained in CLBs
• The F5 and F6 muxes in the Virtex CLB enable the creation of high fan-in functions (up to 19 inputs) with minimal delay
• The carry logic resources create very fast and efficient arithmetic functions
• IOBs contain a variety of resources and enable direct connection to multiple I/O standards
• Virtex contains distributed and block RAM for a variety of applications
• The DLL eliminates internal clock and board-level clock distribution delay