
Lecture 16:
Memory Caching

Theory Wonks:

What is the complexity of these?

1:
for(x = 0; x < SIZE_X; x++)
    for(y = 0; y < SIZE_Y; y++)
        sum = Array[x][y];

2:
for(y = 0; y < SIZE_Y; y++)
    for(x = 0; x < SIZE_X; x++)
        sum = Array[x][y];
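Both loop nests do the same amount of work, SIZE_X * SIZE_Y array reads, so their asymptotic complexity is identical; what differs is the order in which memory is touched. The sketch below is one way to set the two traversals side by side. The array dimensions, the fill pattern, and the function names are assumptions made for illustration, and the loops accumulate with `+=` (rather than the plain assignment on the slide) so that the two results can be compared.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE_X 512  /* assumed dimensions, not taken from the slides */
#define SIZE_Y 512

/* Static so the array does not live on the stack. */
static int Array[SIZE_X][SIZE_Y];

/* Fill the array with a simple, known pattern. */
void fill_array(void) {
    for (size_t x = 0; x < SIZE_X; x++)
        for (size_t y = 0; y < SIZE_Y; y++)
            Array[x][y] = (int)((x + y) % 7);
}

/* Loop 1: the inner loop walks y, so consecutive iterations read
   adjacent addresses (C stores two-dimensional arrays row by row). */
long sum_inner_y(void) {
    long sum = 0;
    for (size_t x = 0; x < SIZE_X; x++)
        for (size_t y = 0; y < SIZE_Y; y++)
            sum += Array[x][y];
    return sum;
}

/* Loop 2: the inner loop walks x, so consecutive iterations read
   addresses that are SIZE_Y * sizeof(int) bytes apart. */
long sum_inner_x(void) {
    long sum = 0;
    for (size_t y = 0; y < SIZE_Y; y++)
        for (size_t x = 0; x < SIZE_X; x++)
            sum += Array[x][y];
    return sum;
}
```

Calling `fill_array()` and then both sum functions should give the same total. Wrapping each call in `clock()` from `<time.h>` is one simple way to compare running times; how large the difference is depends on the machine, the compiler settings, and the array size.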

Practical Demo

2 Bytes of Memory (circa 1947)
Memory (circa 2004)

We have a problem!

Processor-DRAM Performance Gap (latency)

[Figure: performance (log scale, 1 to 1000) versus time, 1980 to 2000. CPU (µProc): 60%/yr. (2X/1.5yr), “Moore’s Law”. DRAM: 9%/yr. (2X/10 yrs). Processor-Memory Performance Gap: grows 50% / year.]

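The rates quoted for the processor-DRAM gap can be cross-checked with a little arithmetic. This is only a sanity check on the figure's own numbers, treating the quoted percentages as exact annual growth factors (an assumption; the labels are rounded):

```latex
\[
  1.60^{1.5} \approx 2.02
  \qquad \text{(60\% per year compounds to about 2X every 1.5 years)}
\]
\[
  \frac{1.60}{1.09} \approx 1.47
  \qquad \text{(the CPU-to-DRAM ratio grows by roughly 47\% per year, close to the quoted 50\%)}
\]
```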
Memory Speed and Cost

Hint: The Principle of Locality
Hint: Temporal and Spatial Locality

for (i=0; i<1000; i++) {
    for (j=0; j<1000; j++) {
        A[i][j] = B[i][j] + C[i][j];
    }
}
if (errorcond) {

}
for (i=0; i<100; i++) {
    for (j=0; j<100; j++) {
        E[i][j] = D[i][j] * A[i][j];
    }
}

One Solution: Caching

[Figure: Cache and Memory, with blocks Blk X and Blk Y and data paths labeled "To datapath" and "From datapath".]

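To make spatial locality concrete, the sketch below counts how many distinct memory blocks a run of evenly spaced accesses touches. It is a simplified model, not anything taken from the slides: the function name `blocks_touched` is hypothetical, and the 64-byte block size and 4-byte element size used in the discussion are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Count the distinct blocks touched by n accesses at byte addresses
   0, stride, 2*stride, ...  Addresses never decrease, so a new block
   appears exactly when the block number changes. */
size_t blocks_touched(size_t n, size_t stride_bytes, size_t block_bytes) {
    assert(block_bytes > 0);
    size_t count = 0;
    size_t last_block = 0;
    for (size_t i = 0; i < n; i++) {
        size_t block = (i * stride_bytes) / block_bytes;
        if (i == 0 || block != last_block) {
            count++;
            last_block = block;
        }
    }
    return count;
}
```

With the assumed 64-byte blocks, `blocks_touched(1000, 4, 64)` is 63: a thousand consecutive 4-byte elements share a few dozen blocks. `blocks_touched(1000, 4000, 64)` is 1000: stepping a whole 1000-element row at a time lands every access in a different block, so a block fetched for one access does nothing for the next.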
Cache Management?

Improving Cache Performance: 3 Options
4 Questions for Caching

Q1: Where can a block be placed?
    (Block placement)

Q2: How is a block found?
    (Block identification)

Q3: Which block should be replaced on a miss?
    (Block replacement)

Q4: What happens on a write?
    (Write strategy)

Direct Mapped Cache

Block Placement: Direct Mapped Cache

Direct Mapped Cache: Hardware
Capture Spatial Locality

Know how to size these!
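As a rough illustration of direct-mapped placement and sizing, the sketch below splits an address into block offset, index, and tag, models a lookup, and computes the total storage a cache needs. Everything concrete here is an assumption chosen for the example: 32-bit addresses, 16-byte blocks, 4 lines, one valid bit per line, and all type and function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 16u  /* assumed block size */
#define NUM_LINES    4u  /* assumed number of cache lines */

typedef struct { bool valid; uint32_t tag; } Line;

typedef struct {
    Line lines[NUM_LINES];
    unsigned hits, misses;
} DirectMappedCache;

void dm_init(DirectMappedCache *c) { memset(c, 0, sizeof *c); }

/* floor(log2(v)); exact when v is a power of two. */
static unsigned log2u(uint32_t v) {
    unsigned n = 0;
    while (v > 1) { v >>= 1; n++; }
    return n;
}

/* Total storage in bits: each line holds the data block, a tag, and
   a valid bit. Assumes 32-bit addresses and power-of-two sizes. */
unsigned long dm_total_bits(uint32_t num_lines, uint32_t block_bytes) {
    assert(num_lines > 0 && (num_lines & (num_lines - 1)) == 0);
    assert(block_bytes > 0 && (block_bytes & (block_bytes - 1)) == 0);
    unsigned offset_bits = log2u(block_bytes);
    unsigned index_bits  = log2u(num_lines);
    unsigned tag_bits    = 32u - index_bits - offset_bits;
    return (unsigned long)num_lines * (8ul * block_bytes + tag_bits + 1ul);
}

/* Look up one address. Each block maps to exactly one line, so a
   miss simply overwrites whatever that line held before. */
bool dm_access(DirectMappedCache *c, uint32_t addr) {
    uint32_t block = addr / BLOCK_BYTES;   /* drop the block offset */
    uint32_t index = block % NUM_LINES;    /* which line to check */
    uint32_t tag   = block / NUM_LINES;    /* identifies the block */
    Line *line = &c->lines[index];
    if (line->valid && line->tag == tag) {
        c->hits++;
        return true;
    }
    line->valid = true;
    line->tag = tag;
    c->misses++;
    return false;
}
```

With these assumed parameters, the address sequence 0, 4, 64, 0 gives a miss, a hit (address 4 is in the same block as 0), a miss, and another miss: addresses 0 and 64 map to the same line and keep evicting each other. For sizing, `dm_total_bits(4, 16)` works out to 4 * (128 + 26 + 1) = 620 bits.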

Reduce Conflict Misses

Four-Way Set Associative Cache

Know how to size these!
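A set-associative cache lets a block live in any of several ways within its set, which removes many conflict misses. The sketch below models a four-way cache with least-recently-used (LRU) replacement. LRU is an assumption here, since the replacement policy on the slides is not recoverable, and the sizes (2 sets, 16-byte blocks) and all names are likewise illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 16u  /* assumed block size */
#define NUM_SETS     2u  /* assumed number of sets */
#define WAYS         4u  /* four-way set associative */

typedef struct { bool valid; uint32_t tag; unsigned long last_used; } Way;

typedef struct {
    Way sets[NUM_SETS][WAYS];
    unsigned long clock;      /* counts accesses; used as an LRU timestamp */
    unsigned hits, misses;
} SetAssocCache;

void sa_init(SetAssocCache *c) { memset(c, 0, sizeof *c); }

/* Look up one address. On a miss, fill an empty way if the set has
   one; otherwise evict the least recently used way. */
bool sa_access(SetAssocCache *c, uint32_t addr) {
    uint32_t block = addr / BLOCK_BYTES;
    uint32_t set   = block % NUM_SETS;
    uint32_t tag   = block / NUM_SETS;
    Way *ways = c->sets[set];
    Way *victim = &ways[0];
    c->clock++;
    for (unsigned w = 0; w < WAYS; w++) {
        if (ways[w].valid && ways[w].tag == tag) {
            ways[w].last_used = c->clock;   /* refresh on a hit */
            c->hits++;
            return true;
        }
        if (!victim->valid)
            continue;                       /* already found an empty way */
        if (!ways[w].valid || ways[w].last_used < victim->last_used)
            victim = &ways[w];
    }
    victim->valid = true;
    victim->tag = tag;
    victim->last_used = c->clock;
    c->misses++;
    return false;
}
```

With these parameters, addresses 0, 32, 64, and 96 all map to set 0 and can reside there together, so a later access to 0 hits. A fifth conflicting block (address 128) then evicts the least recently used one, which is the block holding address 32 because 0 was just refreshed.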
[Figure labels: Direct, Fully Associative]

Associativity: Hardware Cost

Caches take up 20-40% of chip area!

Intel Pentium
Summary
