Intel Core 2 Microarchitecture
Platforms (65 nm):
Woodcrest - Server optimized
Conroe - Desktop optimized
Merom - Mobile optimized
Features:
Intel® Wide Dynamic Execution
Intel® Intelligent Power Capability
Intel® Advanced Smart Cache
Intel® Smart Memory Access
Intel® Advanced Digital Media Boost
Intel Core 2 models
Allendale, Conroe - 65 nm process technology
Desktop CPU
Introduced on July 27, 2006
Number of Transistors 291 Million on 4 MB Models
Number of Transistors 167 Million on 2 MB Models
Variants
Core 2 Duo E6700 - 2.67 GHz (4 MB L2, 1066 MHz FSB)
Core 2 Duo E6600 - 2.40 GHz (4 MB L2, 1066 MHz FSB)
Core 2 Duo E6400 - 2.13 GHz (2 MB L2, 1066 MHz FSB)
Core 2 Duo E6300 - 1.86 GHz (2 MB L2, 1066 MHz FSB)
Core 2 Duo E4200 - 1.60 GHz (2 MB L2, 800 MHz FSB)
Intel Core 2 models
Woodcrest - 65 nm process technology
Intel Core 2 models
Merom - 65 nm process technology
Mobile CPU
Variants
Core 2 Duo T7600 - 2.33 GHz (4 MB L2, 667 MHz FSB)
Core 2 Duo T7400 - 2.16 GHz (4 MB L2, 667 MHz FSB)
Core 2 Duo T7200 - 2.00 GHz (4 MB L2, 667 MHz FSB)
Core 2 Duo T5600 - 1.83 GHz (2 MB L2, 667 MHz FSB)
Core 2 Duo T5500 - 1.66 GHz (2 MB L2, 667 MHz FSB)
Core 2 Duo T5200 - 1.60 GHz (2 MB L2, 533 MHz FSB)
Architectural Features of Core 2
SSSE3 SIMD instructions
Intel Virtualization Technology, multiple OS support
LaGrande Technology, enhanced security hardware extensions
Execute Disable Bit
EIST (Enhanced Intel SpeedStep Technology)
Intel Wide Dynamic Execution
Intel Intelligent Power Capability
Intel Advanced Smart Cache
Intel Smart Memory Access
Intel Advanced Digital Media Boost
What is an instruction set?
All instructions, and all their variations, that a processor
can execute
Types:
Arithmetic such as add and subtract
Logic instructions such as and, or, and not
Data instructions such as move, input, output, load, and
store
Part of the computer architecture
Distinguished from the microarchitecture
Different microarchitectures can share a common
instruction set while their internal designs differ
Instruction cycle: Fetch - Decode - Operand Fetch - Execute - Retire
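The instruction cycle above can be sketched as a small interpreter loop. This is only a toy model: the three-operation instruction set, the register names, and the `run` function are hypothetical and are not taken from the slides or from any real processor.

```python
# Toy instruction cycle: fetch, decode, operand fetch, execute, retire.
# The tiny "ISA" (add, sub, and) is hypothetical, for illustration only.

def run(program, registers):
    """Execute a list of (opcode, dest, src) tuples; return instructions retired."""
    pc = 0          # program counter
    retired = 0
    while pc < len(program):
        instruction = program[pc]                 # fetch
        opcode, dest, src = instruction           # decode
        a, b = registers[dest], registers[src]    # operand fetch
        if opcode == "add":                       # execute
            result = a + b
        elif opcode == "sub":
            result = a - b
        elif opcode == "and":
            result = a & b
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        registers[dest] = result                  # retire: commit the result
        retired += 1
        pc += 1
    return retired


regs = {"r0": 6, "r1": 3}
count = run([("add", "r0", "r1"), ("sub", "r0", "r1"), ("and", "r0", "r1")], regs)
```

The same program could run on a very different internal design (pipelined, out-of-order) and produce the same register contents, which is the distinction between instruction set and microarchitecture drawn above.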
SSSE3 (x86)
Supplemental Streaming SIMD Extensions 3
Intel's name for the SSE instruction set's fourth iteration
Single Instruction Multiple Data instruction set
A revision of SSE3
CPUs with SSSE3
Xeon 5100 series
Intel Core 2
Development
Faster permutation of bytes
Multiplying 16-bit fixed-point numbers with correct
rounding
Better word accumulation
16 new instructions
PSIGNB, PSIGNW, PSIGND - Packed Sign
PABSB, PABSW, PABSD - Packed Absolute Value
PALIGNR - Packed Align Right
PSHUFB - Packed Shuffle Bytes
PMULHRSW - Packed Multiply High with Round and Scale
PMADDUBSW - Multiply and Add Packed Signed and Unsigned Bytes
PHSUBW, PHSUBD - Packed Horizontal Subtract (Words or Doublewords)
PHSUBSW - Packed Horizontal Subtract and Saturate Words
PHADDW, PHADDD - Packed Horizontal Add (Words or Doublewords)
PHADDSW - Packed Horizontal Add and Saturate Words
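To make two of these instructions concrete, here is a minimal scalar sketch of what PSHUFB (byte permutation) and PMULHRSW (16-bit fixed-point multiply with rounding) compute per lane on a 128-bit register. It models the arithmetic only, not the hardware; the function names and the list-of-integers register representation are assumptions made for illustration.

```python
# Scalar emulation of two SSSE3 operations on 128-bit registers.
# Registers are modeled as Python lists; this is an illustrative sketch.

def to_int16(value):
    """Wrap an integer to a signed 16-bit value, as a 16-bit lane would."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pshufb(src, mask):
    """Byte shuffle: each result byte is src[mask & 0x0F], or 0 if mask bit 7 is set."""
    if len(src) != 16 or len(mask) != 16:
        raise ValueError("both operands must hold 16 bytes")
    return [0 if m & 0x80 else src[m & 0x0F] for m in mask]


def pmulhrsw(a, b):
    """Per 16-bit lane: ((a * b >> 14) + 1) >> 1, a rounded Q15 multiply."""
    return [to_int16((((x * y) >> 14) + 1) >> 1) for x, y in zip(a, b)]


# Reverse the 16 bytes of a register with a single shuffle.
reversed_bytes = pshufb(list(range(16)), list(range(15, -1, -1)))

# In Q15 fixed point, 16384 represents 0.5, so 0.5 * 0.5 = 0.25 = 8192.
quarter = pmulhrsw([16384], [16384])
```

A byte reversal like the one above is the kind of permutation that previously needed several shift, mask, and OR steps.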
Execute Disable Bit
Problem
Buffer overflow attacks by malicious software
Intel® Wide Dynamic Execution
Advantage: performance increases while energy consumption decreases
Wider execution
Comprehensive advancements
[Diagram label: L2 cache]
Execution units: Branch, Add, Mul, Load, Store
14-stage pipeline
Pentium D has a 31-stage pipeline
AMD Athlon 64 has a 12-stage pipeline
MacroFusion
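if (myVariable == myConstant)
    doThis();
else
    doThat();
The if statement compiles to a compare instruction followed by a conditional jump instruction; macro-fusion combines the pair into a single micro-op.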
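The effect of macro-fusion on the number of micro-ops can be sketched with a toy counter. It assumes, purely for illustration, that every instruction decodes to one micro-op and that any compare immediately followed by a conditional jump can be fused; real hardware applies further restrictions that are not modeled here. The mnemonic sets and the `count_micro_ops` function are hypothetical.

```python
# Toy model of macro-fusion: a compare followed directly by a conditional
# jump is counted as one micro-op instead of two.
# Simplifying assumption: every other instruction is exactly one micro-op.

COMPARES = {"cmp", "test"}
CONDITIONAL_JUMPS = {"je", "jne", "jl", "jg"}  # illustrative subset only


def count_micro_ops(mnemonics, macro_fusion=True):
    """Count micro-ops for a sequence of instruction mnemonics."""
    count = 0
    i = 0
    while i < len(mnemonics):
        fusible = (
            macro_fusion
            and i + 1 < len(mnemonics)
            and mnemonics[i] in COMPARES
            and mnemonics[i + 1] in CONDITIONAL_JUMPS
        )
        i += 2 if fusible else 1   # a fused pair consumes two instructions
        count += 1                 # but produces a single micro-op
    return count


sequence = ["mov", "cmp", "jne", "add"]
fused = count_micro_ops(sequence)                        # cmp + jne fused
unfused = count_micro_ops(sequence, macro_fusion=False)  # every instruction separate
```

In this model the fused sequence needs three micro-ops instead of four.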
Micro-op Fusion
Example: the macro-op ADD [mem], EAX decodes into three micro-ops:
Load the contents of [mem] into a register (MOV EBX, [mem])
ALU operation: add the two registers together (ADD EBX, EAX)
Store the result back to memory (MOV [mem], EBX)
Micro-ops derived from the same macro-op are fused to reduce the number of micro-ops that need to be executed.
Benefits:
Fewer micro-ops to execute
Lower power consumption
Better scheduling
Fewer micro-ops handled by the out-of-order logic
What is L1 and L2?
Level-1 and Level-2 caches
The cache memories in a computer
Much faster than RAM
L1 is built on the microprocessor chip itself.
L2 was traditionally a separate chip; on Core 2 it is on the processor die.
L2 cache is much larger than L1 cache.
Intel® Advanced Smart Cache
Advantage: decreased traffic with a shared L2 cache; independent per-core caches increase traffic
L2 cache is shared equally
Intel® Smart Memory Access
Why?
Waiting on possible store-load dependencies loses opportunities for out-of-order execution.
What is the idea?
Speculatively ignore store-load dependencies
If there is a dependency, flush the load instruction and re-execute it
How is it checked?
Verify by checking all dispatched store addresses in the memory order buffer
There is a watchdog
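The check described above can be sketched as a toy model: a load reads memory before older stores have written, and once the store addresses are known it is compared against them and redone on a match. The `run_load` function and the dictionary-based memory are hypothetical; the real memory order buffer and the watchdog are not modeled.

```python
# Toy model of speculative load execution with a store-address check.
# Memory is a dict of address -> value; this is an illustrative sketch.

def run_load(memory, pending_stores, load_address):
    """Execute a load ahead of older stores; return (value, was_flushed).

    pending_stores is a list of (address, value) pairs for stores that are
    older than the load but had not yet written when the load executed.
    """
    # Speculate: read memory while ignoring the older, unresolved stores.
    speculative_value = memory.get(load_address, 0)

    # The older stores now complete and their addresses become known.
    for address, value in pending_stores:
        memory[address] = value

    # Check the load against every dispatched store address.
    conflict = any(address == load_address for address, _ in pending_stores)
    if conflict:
        # Dependency found: discard the speculative result and redo the load.
        return memory[load_address], True
    return speculative_value, False


memory = {0x10: 1, 0x20: 2}
independent = run_load(memory, [(0x30, 9)], 0x10)  # no overlap with the store
dependent = run_load(memory, [(0x20, 7)], 0x20)    # store hits the load address
```

The independent load keeps its early result, while the dependent load is flushed and returns the value written by the store.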
Intel® Advanced Digital Media Boost
Improves performance when executing SSE instructions
128-bit SIMD integer arithmetic
128-bit SIMD double-precision floating point
Accelerate a broad range of applications
Video, speech, image processing
Encryption
Financial
Engineering and scientific
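As a rough illustration of what a 128-bit SIMD operation covers, the sketch below counts how many elements of a given width fit in one register and emulates a packed double-precision add on two 16-byte values. The helper names are hypothetical, and the code models only the data layout, not execution speed.

```python
import struct

# Illustrative model of 128-bit SIMD data layout; not a performance model.
REGISTER_BITS = 128


def lanes(element_bits):
    """Number of elements of the given width that fit in one 128-bit register."""
    return REGISTER_BITS // element_bits


def packed_add_double(a, b):
    """Add two 16-byte registers, each holding two little-endian doubles."""
    x = struct.unpack("<2d", a)
    y = struct.unpack("<2d", b)
    return struct.pack("<2d", x[0] + y[0], x[1] + y[1])


reg_a = struct.pack("<2d", 1.0, 2.0)
reg_b = struct.pack("<2d", 3.0, 4.0)
total = struct.unpack("<2d", packed_add_double(reg_a, reg_b))
```

One packed operation here stands in for two scalar double-precision additions, or eight 16-bit integer additions when the register holds words.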