[Figure: Processor-Memory Performance Gap, 1980-2000 (x-axis: Time). CPU performance (Moore's Law) improves 60%/yr (2X/1.5 yr); DRAM performance improves 9%/yr (2X/10 yrs); the gap grows about 50%/year.]
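The growth rate of the gap follows from the two annual rates quoted in the figure. A minimal sketch, assuming both rates compound yearly from a common starting point (the function and constant names are illustrative):

```python
# Annual improvement rates quoted in the figure.
CPU_GROWTH = 0.60   # processor performance: 60% per year
DRAM_GROWTH = 0.09  # DRAM performance: 9% per year

def gap_after(years):
    """Ratio of processor to DRAM performance after `years`, starting at 1.0."""
    return ((1 + CPU_GROWTH) / (1 + DRAM_GROWTH)) ** years

# Yearly growth of the gap itself: 1.60 / 1.09 - 1, roughly 47%.
annual_gap_growth = gap_after(1) - 1

print(f"gap grows about {annual_gap_growth:.0%} per year")
print(f"gap after 20 years: about {gap_after(20):.0f}x")
```

The computed rate is a little under 50%, which matches the rounded figure on the slide.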
Impact on Performance
Suppose that 10% of memory operations get a 50-cycle miss penalty
CPI = ideal CPI + average stalls per instruction
    = 1.1 (cycles) + (0.30 (data mem ops/instr) x 0.10 (misses/data mem op) x 50 (cycles/miss))
    = 1.1 cycles + 1.5 cycles = 2.6
58% of the time the processor is stalled waiting for memory!
A 1% instruction miss rate would add an additional 0.5 cycles to the CPI!
[Figure: pie chart of CPI contributions: Ideal CPI (1.1) 35%, Data Miss (1.6) 49%, Inst Miss (0.5) 16%.]
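The CPI arithmetic above can be reproduced with a short calculation. This is a sketch using the slide's numbers; the function name is illustrative, and the instruction-miss line assumes one instruction fetch per instruction.

```python
def cpi_with_stalls(ideal_cpi, mem_ops_per_instr, miss_rate, miss_penalty):
    """Return (total CPI, stall cycles per instruction)."""
    stalls = mem_ops_per_instr * miss_rate * miss_penalty
    return ideal_cpi + stalls, stalls

# Slide values: ideal CPI 1.1, 0.30 data memory ops per instruction,
# 10% miss rate, 50-cycle miss penalty.
cpi, stalls = cpi_with_stalls(1.1, 0.30, 0.10, 50)
stall_fraction = stalls / cpi  # share of time spent waiting on memory

# Assumption: every instruction needs exactly one instruction fetch,
# so a 1% instruction miss rate costs 1.0 x 0.01 x 50 cycles.
inst_miss_stalls = 1.0 * 0.01 * 50

print(f"CPI = {cpi:.1f}, stalled {stall_fraction:.0%} of the time")
print(f"extra CPI from instruction misses = {inst_miss_stalls:.1f}")
```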
[Figure: SRAM organization. An Address Decoder driven by A0-A3 selects one word line (Word 1 through Word 15 shown); each column of SRAM Cells feeds a Sense Amp that drives one output bit (Dout 3 through Dout 0); WrEn is the write enable.]
[Figure: 256K x 8 DRAM block diagram with WE_L and OE_L control inputs, row address and column address inputs, and a data connection.]
Control Signals (RAS_L, CAS_L, WE_L, OE_L) are all active low
Din and Dout are combined (D):
  WE_L is asserted (Low), OE_L is deasserted (High): D serves as the data input pin
  WE_L is deasserted (High), OE_L is asserted (Low): D is the data output pin
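Because the control signals are active low, "asserted" means a logic 0 on the pin. A minimal sketch of the two cases listed above; the function name is hypothetical, and any other combination is reported as not covered rather than guessed:

```python
LOW, HIGH = 0, 1  # active-low signals: LOW means asserted

def d_pin_role(we_l, oe_l):
    """Role of the shared D pin for the two combinations described above."""
    if we_l == LOW and oe_l == HIGH:
        return "input"    # write: D carries data into the DRAM
    if we_l == HIGH and oe_l == LOW:
        return "output"   # read: D carries data out of the DRAM
    return "not covered here"

print(d_pin_role(LOW, HIGH))
print(d_pin_role(HIGH, LOW))
```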
Simple:
  CPU, Cache, Bus, Memory same width (1 word)
Wide:
  CPU/Mux 1 word; Mux/Cache, Bus, Memory N words (Alpha: 64 bits & 256 bits)
Interleaved:
  CPU, Cache, Bus 1 word; Memory N Modules (4 Modules); example is word interleaved
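To make the three organizations concrete, the sketch below compares the miss penalty for fetching one cache block. All timing numbers are assumptions chosen for illustration and do not come from the slides: 1 cycle to send the address, 6 cycles per memory access, 1 cycle per bus transfer, and a 4-word block. The wide organization is assumed to be a full block wide, and the interleaved one is assumed to have one bank per word of the block.

```python
# Assumed timing parameters (illustrative only, not from the slides).
ADDR_CYCLES = 1    # cycles to send the address
ACCESS_CYCLES = 6  # cycles for one memory access
XFER_CYCLES = 1    # cycles to move one bus-width of data
BLOCK_WORDS = 4    # words per cache block

def simple_penalty():
    # Everything is one word wide: each word pays address + access + transfer.
    return BLOCK_WORDS * (ADDR_CYCLES + ACCESS_CYCLES + XFER_CYCLES)

def wide_penalty():
    # Memory and bus are a full block wide: one access moves the whole block.
    return ADDR_CYCLES + ACCESS_CYCLES + XFER_CYCLES

def interleaved_penalty():
    # All banks access in parallel; words then cross the 1-word bus one by one.
    return ADDR_CYCLES + ACCESS_CYCLES + BLOCK_WORDS * XFER_CYCLES

for name, fn in [("simple", simple_penalty),
                 ("wide", wide_penalty),
                 ("interleaved", interleaved_penalty)]:
    print(f"{name:12s} {fn()} cycles")
```

Under these assumptions the interleaved organization gets close to the wide one while keeping a one-word bus.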
[Figure: access timing. Without interleaving, the access for D2 starts only after D1 is available. With 4-way interleaving (CPU connected to Memory Banks 0-3), accesses to Bank 0, Bank 1, Bank 2, and Bank 3 are started one after another, after which Bank 0 can be accessed again.]
Interleaved memory
Send address to several banks simultaneously
Want Number of banks > Number of clocks to access bank
As the size of memory chips increases, it becomes difficult to interleave memory
Difficult to expand memory by small amounts
May not work well for non-sequential accesses
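The relationship between bank count and bank access time can be explored with a small timing model. This is a sketch under stated assumptions, not a model of any particular machine: memory is word interleaved, one new address can be issued per clock, and a bank is busy for `access_clocks` clocks after an access starts. The function name is hypothetical.

```python
def cycles_for_stream(addresses, n_banks, access_clocks):
    """Clock at which the last word of the address stream is available."""
    bank_free_at = [0] * n_banks  # clock at which each bank is idle again
    clock = 0                     # earliest clock for issuing the next address
    finish = 0
    for addr in addresses:
        bank = addr % n_banks                   # word-interleaved mapping
        start = max(clock, bank_free_at[bank])  # stall while the bank is busy
        bank_free_at[bank] = start + access_clocks
        finish = start + access_clocks
        clock = start + 1
    return finish

sequential = range(16)
print(cycles_for_stream(sequential, n_banks=8, access_clocks=4))  # enough banks
print(cycles_for_stream(sequential, n_banks=2, access_clocks=4))  # too few banks
# A stride equal to the bank count sends every access to the same bank.
print(cycles_for_stream(range(0, 128, 8), n_banks=8, access_clocks=4))
```

In this model a sequential stream never stalls once there are at least as many banks as access clocks, while the strided stream behaves as if there were a single bank.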
Even with 128 banks there are conflicts, since 512 is an even multiple of 128
SW: loop interchange or declaring array not power of 2
HW: Prime number of banks
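The conflict and both remedies can be checked by counting how many distinct banks a strided access pattern touches. The sketch assumes that the 512 above is the distance in words between successive accesses (for example, walking down a column of an array whose rows are 512 words long) and that bank number = address mod number of banks; the function name is illustrative.

```python
def banks_touched(stride, n_accesses, n_banks):
    """Number of distinct banks hit by addresses 0, stride, 2*stride, ..."""
    return len({(i * stride) % n_banks for i in range(n_accesses)})

# Stride 512 with 128 banks: every access lands in the same bank.
print(banks_touched(512, 256, 128))
# SW fix (assumed row length of 513 words instead of 512): accesses spread out.
print(banks_touched(513, 256, 128))
# HW fix: a prime number of banks, here 127.
print(banks_touched(512, 256, 127))
```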
bank number = address mod number of banks
address within bank = address / number of banks
modulo & divide per memory access?
address within bank = address mod number of words in bank (3, 7, 31)
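The two mappings can be compared directly. The sketch below assumes the number of words per bank is a power of two, so the second modulo reduces to a bit mask. The mod-only mapping gives every address a unique (bank, offset) pair only when the number of banks and the number of words per bank share no common factor, which follows from the Chinese Remainder Theorem. Function names are hypothetical.

```python
def map_div(addr, n_banks):
    """Conventional mapping: one modulo and one divide per access."""
    return addr % n_banks, addr // n_banks

def map_mod(addr, n_banks, words_per_bank):
    """Mod-only mapping; words_per_bank must be a power of two for the mask."""
    return addr % n_banks, addr & (words_per_bank - 1)

def is_one_to_one(n_banks, words_per_bank):
    """True if map_mod gives every address a distinct (bank, offset) pair."""
    total = n_banks * words_per_bank
    pairs = {map_mod(a, n_banks, words_per_bank) for a in range(total)}
    return len(pairs) == total

for n_banks in (3, 7, 31, 4):
    print(n_banks, is_one_to_one(n_banks, words_per_bank=8))
```

With 3, 7, or 31 banks the mapping is one-to-one; with 4 banks and 8 words per bank it is not, because the two moduli share a factor.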
N rows x N columns x M bits
Read & Write M bits at a time
Each M-bit access requires a RAS / CAS cycle
[Figure: DRAM with Row Address input and M-bit Output, plus timing diagram: for each of the 1st and 2nd M-bit accesses, A carries a Row Address (RAS_L), then a Col Address (CAS_L), then Junk.]
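The row/column addressing can be sketched as a split of the word address into two halves that are presented one after the other. The example assumes a square array and that the row is taken from the high-order part of the address (an illustrative convention); for a 256K-word DRAM a square array would be 512 x 512.

```python
def split_address(addr, n_cols):
    """Split a word address into (row, column) for an array with n_cols columns."""
    return divmod(addr, n_cols)  # row = addr // n_cols, col = addr % n_cols

N = 512                               # assumed: 256K words as a 512 x 512 array
address_bits = (N - 1).bit_length()   # bits needed for a row or column address

row, col = split_address(173555, N)
print(f"{address_bits} address bits, row {row}, column {col}")
```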
[Figure: DRAM timing with a single Row Address followed by a separate Col Address for each of the 2nd, 3rd, and 4th M-bit accesses.]
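The contrast between paying a full RAS / CAS cycle on every access and reusing one row address for several column accesses can be sketched by counting address phases. This is a simplified model with hypothetical names: it counts row-address and column-address phases only and ignores all real timing parameters.

```python
def address_phases(addresses, n_cols, keep_row_open):
    """Return (row_address_phases, col_address_phases) for an access sequence."""
    row_phases = col_phases = 0
    open_row = None
    for addr in addresses:
        row = addr // n_cols
        # A new row address is needed on every access unless the row is kept
        # selected and the next access falls in the same row.
        if not keep_row_open or row != open_row:
            row_phases += 1
            open_row = row
        col_phases += 1
    return row_phases, col_phases

same_row = [0, 1, 2, 3]  # four M-bit accesses within one row
print(address_phases(same_row, n_cols=512, keep_row_open=False))
print(address_phases(same_row, n_cols=512, keep_row_open=True))
```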