
Intel® Core 2 Duo
Desktop Processor Architecture
What’s next?
 History
 Intel Core 2 Duo
 Intel Core 2 Microarchitecture
 Intel Core 2 Models
 Architectural Features of Core 2
 What is an instruction set?
 SSSE3 (x86)
 Execute Disable Bit
 Intel® Wide Dynamic Execution
 14 Stage pipeline
 MacroFusion
 Micro-op Fusion
 What is L1 and L2?
 Intel® Advanced Smart Cache
 Intel® Smart Memory Access
 Intel® Advanced Digital Media Boost
History
(List of Intel microprocessors)
 The 4-bit processors
4004, 4040
 The 8-bit processors
8008, 8080, 8085
 The 16-bit processors: Origin of x86
8086, 8088, 80186, 80188, 80286
 The 32-bit processors: Non x86
iAPX 432, 80960, 80860, XScale
 The 32-bit processors: The 80386 Range
80386DX, 80386SX, 80376, 80386SL, 80386EX
 The 32-bit processors: The 80486 Range
80486DX, 80486SX, 80486DX2, 80486SL, 80486DX4
 The 32-bit processors: The Pentium (“I”)
Pentium, Pentium MMX
 The 32-bit processors: P6/Pentium M
Pentium Pro, Pentium II, Celeron, Pentium III, PII and III Xeon
Celeron(PIII), Pentium M, Celeron M, Intel Core, Dual Core Xeon LV
 The 32-bit processors: NetBurst microarchitecture
Pentium 4, Xeon, Pentium 4 EE
 The 64-bit processors: IA-64
Itanium, Itanium 2
 The 64-bit processors: EM64T-NetBurst
Pentium D, Pentium Extreme Edition, Xeon
 The 64-bit processors: EM64T-Core microarchitecture
Xeon, Intel Core 2
Intel Core 2 Duo

Intel Core 2 Microarchitecture
 Five microarchitecture innovations
 Intel® Wide Dynamic Execution
 Intel® Intelligent Power Capability
 Intel® Advanced Smart Cache
 Intel® Smart Memory Access
 Intel® Advanced Digital Media Boost
 Three 65nm processors
 Woodcrest: Server Optimized
 Conroe: Desktop Optimized
 Merom: Mobile Optimized

Intel Core 2 models
 Allendale, Conroe - 65 nm process technology
 Desktop CPU
 Introduced on July 27, 2006
 291 million transistors on 4 MB models
 167 million transistors on 2 MB models
 Variants
 Core 2 Duo E6700 - 2.67 GHz (4 MB L2, 1066 MHz FSB)
 Core 2 Duo E6600 - 2.40 GHz (4 MB L2, 1066 MHz FSB)
 Core 2 Duo E6400 - 2.13 GHz (2 MB L2, 1066 MHz FSB)
 Core 2 Duo E6300 - 1.86 GHz (2 MB L2, 1066 MHz FSB)
 Core 2 Duo E4200 - 1.60 GHz (2 MB L2, 800 MHz FSB)

Intel Core 2 models
 Woodcrest - 65 nm process technology
 Server optimized CPU
 Introduced on June 26, 2006
 Same features as Conroe
 Variants
 Xeon 5160 - 3.00 GHz (4 MB L2, 1333 MHz FSB, 80 W)
 Xeon 5150 - 2.66 GHz (4 MB L2, 1333 MHz FSB, 65 W)
 Xeon 5140 - 2.33 GHz (4 MB L2, 1333 MHz FSB, 65 W)
 Xeon 5130 - 2.00 GHz (4 MB L2, 1333 MHz FSB, 65 W)
 Xeon 5120 - 1.86 GHz (4 MB L2, 1066 MHz FSB, 65 W)
 Xeon 5110 - 1.60 GHz (4 MB L2, 1066 MHz FSB, 65 W)
 Xeon 5148LV - 2.33 GHz (4 MB L2, 1333 MHz FSB, 40 W)

Intel Core 2 models
 Merom - 65 nm process technology
 Mobile CPU
 Introduced on July 27, 2006
 Same features as Conroe
 Variants
 Core 2 Duo T7600 - 2.33 GHz (4 MB L2, 667 MHz FSB)
 Core 2 Duo T7400 - 2.16 GHz (4 MB L2, 667 MHz FSB)
 Core 2 Duo T7200 - 2.00 GHz (4 MB L2, 667 MHz FSB)
 Core 2 Duo T5600 - 1.83 GHz (2 MB L2, 667 MHz FSB)
 Core 2 Duo T5500 - 1.66 GHz (2 MB L2, 667 MHz FSB)
 Core 2 Duo T5200 - 1.60 GHz (2 MB L2, 533 MHz FSB)

Architectural Features of Core 2
 SSSE3 SIMD instructions
 Intel Virtualization Technology, multiple OS support
 LaGrande Technology, enhanced security hardware extensions
 Execute Disable Bit
 EIST (Enhanced Intel SpeedStep Technology)
 Intel Wide Dynamic Execution
 Intel Intelligent Power Capability
 Intel Advanced Smart Cache
 Intel Smart Memory Access
 Intel Advanced Digital Media Boost

What is an instruction set?
 All instructions, and all their variations, that a processor
can execute
 Types:
 Arithmetic such as add and subtract
 Logic instructions such as and, or, and not
 Data instructions such as move, input, output, load, and
store
 Part of the computer architecture
 Distinguished from the microarchitecture
 Different microarchitectures can share a common instruction set while their internal designs differ
 Basic instruction flow: Fetch → Decode → Operand Fetch → Execute → Retire

SSSE3 (x86)
Supplemental Streaming SIMD Extensions 3
 Intel's name for the SSE instruction set's fourth iteration
 Single Instruction Multiple Data instruction set
 A revision of SSE3
 CPUs with SSSE3
 Xeon 5100 series
 Intel Core 2
 Improvements
 Faster permutation of bytes
 Multiplying 16-bit fixed-point numbers with correct rounding
 Better within-word accumulation (horizontal add and subtract)

SSSE3 (x86)
Supplemental Streaming SIMD Extensions 3
 16 New instructions
 PSIGNB, PSIGNW, PSIGND
 Packed Sign
 PABSB, PABSW, PABSD
 Packed Absolute Value
 PALIGNR
 Packed Align Right
 PSHUFB
 Packed Shuffle Bytes
 PMULHRSW
 Packed Multiply High with Round and Scale
 PMADDUBSW
 Multiply and Add Packed Signed and Unsigned Bytes
 PHSUBW, PHSUBD
 Packed Horizontal Subtract (Words or Doublewords)
 PHSUBSW
 Packed Horizontal Subtract and Saturate Words
 PHADDW, PHADDD
 Packed Horizontal Add (Words or Doublewords)
 PHADDSW
 Packed Horizontal Add and Saturate Words
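
To make a few of the instructions listed above more concrete, the following is a minimal Python sketch that emulates the per-lane arithmetic of PSHUFB, PMULHRSW and PHADDW on plain lists of integers. It only illustrates the semantics; real code would use the instructions through assembly or compiler intrinsics. The function names and the helper `to_signed16` are hypothetical and chosen for this illustration.

```python
def to_signed16(x):
    """Wrap an integer to a signed 16-bit value, as a 16-bit lane would."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def pshufb(src, mask):
    """Packed Shuffle Bytes on 16-byte vectors.

    Each mask byte selects a source byte by its low 4 bits;
    if the mask byte's top bit is set, the result byte is zero.
    """
    assert len(src) == 16 and len(mask) == 16
    return [0 if m & 0x80 else src[m & 0x0F] for m in mask]


def pmulhrsw(a, b):
    """Packed Multiply High with Round and Scale on signed 16-bit lanes.

    Treats lanes as Q15 fixed-point numbers: multiply, keep the high
    bits, and round to nearest instead of truncating.
    """
    return [to_signed16((((x * y) >> 14) + 1) >> 1) for x, y in zip(a, b)]


def phaddw(a, b):
    """Packed Horizontal Add: sum adjacent pairs of a, then adjacent pairs of b."""
    def pair_sums(v):
        return [to_signed16(v[i] + v[i + 1]) for i in range(0, len(v), 2)]
    return pair_sums(a) + pair_sums(b)


# Reverse the byte order of a 16-byte vector with a single shuffle.
reversed_bytes = pshufb(list(range(16)), list(range(15, -1, -1)))

# 0.5 * 0.5 in Q15 fixed point (16384 represents 0.5, 8192 represents 0.25).
quarter = pmulhrsw([16384], [16384])  # -> [8192]
```

The byte-reversal example shows why PSHUFB speeds up permutations: an arbitrary rearrangement of 16 bytes is expressed as one operation instead of a sequence of shifts and masks.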

Execute Disable Bit
 Problem
 Buffer overflow attacks by malicious software
 Must be combined with a supporting operating system
 Classifies areas in memory as executable or non-executable
 Blocks code execution from non-executable areas during an attack
 Decreases the need for software patches and antivirus software

Intel® Wide Dynamic Execution

 Performance increases while energy consumption decreases
 Advantage
 Wider execution
 Comprehensive advancements
 Enabled in each core
 Each core fetches, dispatches, executes and retires up to four full instructions simultaneously.

Branch – Add – Mul – Load - Store
14 Stage pipeline
 Pentium D has a 31-stage pipeline
 AMD Athlon 64 has a 12-stage pipeline

 A question for the class:
 Why didn’t Intel lengthen the pipeline further after its 31-stage experience with the Pentium D?

Figure: a pipeline holding instructions I1, I2, I3, … I99, I100. A jump leaves a bubble of non-work in the pipeline.
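
The cost of such a bubble can be estimated with a first-order model: every mispredicted branch throws away roughly one pipeline's worth of work, so deeper pipelines pay more per misprediction. The sketch below is a rough illustration under stated assumptions, not a measurement. The branch fraction (20%), the misprediction rate (5%), the base rate of one cycle per instruction, and the idea that a flush costs about as many cycles as the pipeline has stages are all assumed values chosen for the example.

```python
def cycles_per_instruction(pipeline_depth,
                           branch_fraction=0.2,
                           mispredict_rate=0.05,
                           base_cpi=1.0):
    """First-order estimate of average cycles per instruction.

    Assumption: a mispredicted branch costs about one full pipeline
    refill, i.e. `pipeline_depth` cycles of non-work.
    """
    flush_penalty = pipeline_depth
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty


# Pipeline depths as given on the slide.
for name, depth in [("Core 2", 14), ("Pentium D", 31), ("Athlon 64", 12)]:
    cpi = cycles_per_instruction(depth)
    print(f"{name:10s} depth {depth:2d}: {cpi:.2f} cycles/instruction")
```

Under these assumed numbers the 31-stage design loses noticeably more cycles to bubbles than the 14-stage design, which is one way to approach the question for the class.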

MacroFusion

if (myVariable == myConstant)    // compare instruction
    doThis();
else                             // jump instructions
    doThat();

Compare + Jump = one micro-op
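
The fusion step can be pictured as a pass over the decoded instruction stream that merges a compare with the conditional jump that immediately follows it. The sketch below is a simplified model written for illustration. The tuple representation, the function name `macro_fuse` and the listed jump mnemonics are assumptions, and real hardware applies additional restrictions that are not modeled here.

```python
# Illustrative subset of x86 conditional jumps (not an exhaustive list).
COND_JUMPS = {"je", "jne", "jl", "jle", "jg", "jge"}


def macro_fuse(instructions):
    """Merge each `cmp` that is directly followed by a conditional jump.

    `instructions` is a list of tuples: (mnemonic, operand, ...).
    Returns a new list in which every fused pair is a single entry.
    """
    fused = []
    i = 0
    while i < len(instructions):
        cur = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if cur[0] == "cmp" and nxt is not None and nxt[0] in COND_JUMPS:
            # Compare + Jump become one entry that flows through the pipeline.
            fused.append((cur[0] + "+" + nxt[0],) + cur[1:] + nxt[1:])
            i += 2
        else:
            fused.append(cur)
            i += 1
    return fused


# Hypothetical instruction sequence for the if/else shown above.
program = [
    ("cmp", "eax", "42"),
    ("jne", "else_branch"),
    ("call", "doThis"),
    ("jmp", "end"),
    ("call", "doThat"),
]
result = macro_fuse(program)  # 5 instructions become 4 entries
```

Because compare-and-branch pairs are very common in compiled code, removing one entry per pair frees decode, execution and retirement bandwidth for other instructions.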

Micro-op Fusion
Example: ADD [mem], EAX is decoded into three micro-ops:
 Load the contents of [mem] into a register (MOV EBX, [mem])
 An ALU operation: ADD the two registers together (ADD EBX, EAX)
 Store the result back to memory (MOV [mem], EBX)

 Micro-ops derived from the same macro-op are fused to reduce the number of micro-ops that need to be executed.
 Fewer micro-ops to execute
 Lower power consumption
 Better scheduling
 Fewer micro-ops handled by the out-of-order logic

What is L1 and L2?
 Level-1 and Level-2 caches
 The cache memories in a computer
 Much faster than RAM
 L1 is built on the microprocessor chip itself.
 L2 was traditionally a separate chip; on Core 2 it is on the same die and shared by both cores.
 L2 cache is much larger than L1 cache.
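
The benefit of this hierarchy is commonly summarized by the average memory access time: most accesses hit in the small, fast L1, some fall through to the larger L2, and only a few go to RAM. The sketch below computes that average. The latencies and miss rates in the usage example are illustrative assumptions, not figures from these slides.

```python
def average_access_time(l1_hit_cycles, l1_miss_rate,
                        l2_hit_cycles, l2_miss_rate,
                        memory_cycles):
    """Average memory access time for a two-level cache hierarchy.

    Every access pays the L1 latency; L1 misses additionally pay the
    L2 latency; L2 misses additionally pay the main-memory latency.
    """
    l2_and_beyond = l2_hit_cycles + l2_miss_rate * memory_cycles
    return l1_hit_cycles + l1_miss_rate * l2_and_beyond


# Assumed numbers: 3-cycle L1, 14-cycle L2, 200-cycle RAM,
# 5% of accesses miss L1 and 20% of those also miss L2.
with_caches = average_access_time(3, 0.05, 14, 0.20, 200)
without_caches = 200
```

With these assumed values the average access costs only a few cycles instead of the full trip to RAM, which is why even a small L1 combined with a larger L2 pays off.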
Intel® Advanced Smart Cache

 Advantage
 Higher cache hit rate
 Reduced bus traffic
 Lower latency to data
 L2 cache is shared equally
 Data stored in one place
 Optimizes cache resources
 Up to 100% utilization of L2 cache
 Figure labels: decreased traffic (shared L2) versus increased traffic (separate L2 caches)

Intel® Smart Memory Access
 Why?
 A load that waits for every earlier store loses opportunities for out-of-order execution.
 What is the idea?
 Ignore the store-load dependencies and execute the load speculatively
 If there turns out to be a dependency, flush the load instruction and re-execute it
 How is it checked?
 Verify by checking all dispatched store addresses in the memory order buffer
 There is a watchdog to limit repeated mis-speculation
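
The speculate, verify and flush sequence described above can be sketched in a few lines. This is a toy model written for illustration, not the hardware algorithm. The function name `speculative_load`, the dictionary used as memory and the list of older stores standing in for the memory order buffer are all assumptions.

```python
def speculative_load(memory, older_stores, load_addr):
    """Model one load that is executed ahead of older, unresolved stores.

    memory:       dict mapping address -> value (state before the stores)
    older_stores: list of (address, value) in program order, all older
                  than the load; their addresses become known only later
    Returns (value_seen_by_load, was_flushed).
    """
    # Step 1: speculate. Execute the load early, ignoring the older stores.
    speculative_value = memory.get(load_addr, 0)

    # Step 2: verify. Once the store addresses are known, compare each
    # of them with the load address.
    conflicting = [value for addr, value in older_stores if addr == load_addr]
    if not conflicting:
        return speculative_value, False  # speculation was correct

    # Step 3: flush. The load is re-executed and must observe the
    # youngest older store to the same address.
    return conflicting[-1], True


memory = {0x10: 1, 0x20: 2}

# No older store touches 0x10: the early load was safe.
safe = speculative_load(memory, [(0x30, 9)], 0x10)

# An older store writes 0x10: the load is flushed and sees the stored value.
flushed = speculative_load(memory, [(0x10, 7), (0x10, 8)], 0x10)
```

When most loads do not depend on nearby stores, the common case returns early with no penalty, and only the rare conflicting load pays for a flush.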

Intel® Advanced Digital Media Boost

 Previous generation: lower 64 bits in one cycle, upper 64 bits in the next
 Core microarchitecture: 128-bit instruction completed in one cycle
 Improves performance when executing SSE instructions
 128-bit SIMD integer arithmetic
 128-bit SIMD double-precision floating point
 Accelerate a broad range of applications
 Video, speech, image processing
 Encryption
 Financial
 Engineering and scientific
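
The effect of the wider execution path can be shown with simple arithmetic: a 128-bit operation on a 64-bit-wide unit needs two passes, while a 128-bit-wide unit needs one. The sketch below is a deliberately simplified throughput model that ignores latency, pipelining and memory effects. The function name `issue_cycles` and the operation count in the example are assumptions for illustration.

```python
import math


def issue_cycles(num_ops, op_bits=128, datapath_bits=64):
    """Cycles needed to issue `num_ops` SIMD operations of `op_bits` width
    on an execution unit that processes `datapath_bits` per cycle."""
    passes_per_op = math.ceil(op_bits / datapath_bits)
    return num_ops * passes_per_op


# One thousand 128-bit SSE operations on each kind of execution unit.
split_in_halves = issue_cycles(1000, datapath_bits=64)   # -> 2000
single_cycle = issue_cycles(1000, datapath_bits=128)     # -> 1000
```

In this model the 128-bit unit halves the issue cycles for SSE-heavy code, which matches the slide's claim of one cycle per 128-bit instruction instead of two.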
