CH03 COA10e - Top Level PDF

William Stallings
Computer Organization and Architecture
10th Edition

Chapter 3
A Top-Level View of Computer Function and Interconnection

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.

Computer Components

• Contemporary computer designs are based on concepts developed by John von Neumann at the Institute for Advanced Study, Princeton
• This design is referred to as the von Neumann architecture and is based on three key concepts:
  • Data and instructions are stored in a single read-write memory
  • The contents of this memory are addressable by location, without regard to the type of data contained there
  • Execution occurs in a sequential fashion (unless explicitly modified) from one instruction to the next
• Hardwired program
  • The result of the process of connecting the various components in the desired configuration

[Figure 3.1 Hardware and Software Approaches. (a) Programming in hardware: data enter a fixed sequence of arithmetic and logic functions, which produces results. (b) Programming in software: an instruction interpreter turns instruction codes into control signals for general-purpose arithmetic and logic functions, which transform data into results.]

Software

• A sequence of codes or instructions
• Part of the hardware interprets each instruction and generates control signals
• Provide a new sequence of codes for each new program instead of rewiring the hardware

Major components:
• CPU
  • Instruction interpreter
  • Module of general-purpose arithmetic and logic functions
• I/O components
  • Input module
    • Contains basic components for accepting data and instructions and converting them into an internal form of signals usable by the system
  • Output module
    • Means of reporting results

• Memory address register (MAR)
  • Specifies the address in memory for the next read or write
• Memory buffer register (MBR)
  • Contains the data to be written into memory or receives the data read from memory
• I/O address register (I/OAR)
  • Specifies a particular I/O device
• I/O buffer register (I/OBR)
  • Used for the exchange of data between an I/O module and the CPU

[Figure 3.2 Computer Components: Top-Level View. The CPU (PC, IR, MAR, MBR, I/O AR, I/O BR, execution unit), main memory (locations 0 to n–1 holding instructions and data), and an I/O module with buffers are connected by the system bus.]

PC = Program counter
IR = Instruction register
MAR = Memory address register
MBR = Memory buffer register
I/O AR = Input/output address register
I/O BR = Input/output buffer register

[Figure 3.3 Basic Instruction Cycle. START → fetch next instruction (fetch cycle) → execute instruction (execute cycle) → back to fetch, or HALT.]

Fetch Cycle

• At the beginning of each instruction cycle the processor fetches an instruction from memory
• The program counter (PC) holds the address of the instruction to be fetched next
• The processor increments the PC after each instruction fetch so that it will fetch the next instruction in sequence
• The fetched instruction is loaded into the instruction register (IR)
• The processor interprets the instruction and performs the required action

Action Categories

• Processor-memory
  • Data transferred from processor to memory or from memory to processor
• Processor-I/O
  • Data transferred to or from a peripheral device by transferring between the processor and an I/O module
• Data processing
  • The processor may perform some arithmetic or logic operation on data
• Control
  • An instruction may specify that the sequence of execution be altered

(a) Instruction format
    Bits 0–3 = Opcode
    Bits 4–15 = Address
(b) Integer format
    Bit 0 = S (sign)
    Bits 1–15 = Magnitude
(c) Internal CPU registers
    Program Counter (PC) = Address of instruction
    Instruction Register (IR) = Instruction being executed
    Accumulator (AC) = Temporary storage
(d) Partial list of opcodes
    0001 = Load AC from Memory
    0010 = Store AC to Memory
    0101 = Add to AC from Memory
Figure 3.4 Characteristics of a Hypothetical Machine
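
The formats in Figure 3.4 can be illustrated with a short sketch. It assumes that bit 0 in the figure is the most significant bit of the 16-bit word, which agrees with the hexadecimal values in Figure 3.5 (1940 is opcode 1, address 940). The function names are hypothetical and chosen only for this illustration.

```python
# Opcodes listed in Figure 3.4(d).
OPCODES = {
    0b0001: "Load AC from Memory",
    0b0010: "Store AC to Memory",
    0b0101: "Add to AC from Memory",
}


def decode_instruction(word: int) -> tuple[int, int]:
    """Split a 16-bit instruction word into (opcode, address)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction must fit in 16 bits")
    opcode = word >> 12       # bits 0-3 (assumed most significant)
    address = word & 0x0FFF   # bits 4-15
    return opcode, address


def decode_integer(word: int) -> int:
    """Read a 16-bit word as sign-magnitude: 1 sign bit, 15 magnitude bits."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("integer must fit in 16 bits")
    magnitude = word & 0x7FFF
    return -magnitude if word & 0x8000 else magnitude


opcode, address = decode_instruction(0x1940)
print(OPCODES[opcode], f"at address {address:03X}")
```

With this bit numbering, the 12-bit address field can reach 4096 distinct memory locations.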

Program in memory: 300 = 1940, 301 = 5941, 302 = 2941
Data in memory: 940 = 0003, 941 = 0002

Step  PC   AC    IR    [940]  [941]
1     300  -     1940  0003   0002
2     301  0003  1940  0003   0002
3     301  0003  5941  0003   0002
4     302  0005  5941  0003   0002   (3 + 2 = 5)
5     302  0005  2941  0003   0002
6     303  0005  2941  0003   0005

Figure 3.5 Example of Program Execution (contents of memory and registers in hexadecimal)
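
The trace in Figure 3.5 can be reproduced with a small simulator of the hypothetical machine. This is a minimal sketch under stated assumptions: it supports only the three opcodes listed in Figure 3.4, treats addition as plain unsigned 16-bit arithmetic (it ignores the sign bit), and uses a Python dictionary as memory. The function name `run` is hypothetical.

```python
def run(memory: dict[int, int], pc: int, instructions: int) -> list[tuple[int, int, int]]:
    """Execute a number of instructions; return (PC, AC, IR) after each one."""
    ac = 0
    trace = []
    for _ in range(instructions):
        ir = memory[pc]                 # fetch: load the instruction into IR
        pc += 1                         # increment PC to the next instruction
        opcode, address = ir >> 12, ir & 0x0FFF
        if opcode == 0x1:               # Load AC from Memory
            ac = memory[address]
        elif opcode == 0x2:             # Store AC to Memory
            memory[address] = ac
        elif opcode == 0x5:             # Add to AC from Memory
            ac = (ac + memory[address]) & 0xFFFF
        else:
            raise ValueError(f"unsupported opcode {opcode:X}")
        trace.append((pc, ac, ir))
    return trace


memory = {0x300: 0x1940, 0x301: 0x5941, 0x302: 0x2941,
          0x940: 0x0003, 0x941: 0x0002}
for pc, ac, ir in run(memory, pc=0x300, instructions=3):
    print(f"PC={pc:03X} AC={ac:04X} IR={ir:04X}")
# PC=301 AC=0003 IR=1940
# PC=302 AC=0005 IR=5941
# PC=303 AC=0005 IR=2941
```

Each printed line corresponds to the end of an execute cycle, that is, to Steps 2, 4, and 6 of the figure. After the run, location 941 holds 0005.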

[Figure 3.6 Instruction Cycle State Diagram. States: instruction address calculation, instruction fetch, instruction operation decoding, operand address calculation, operand fetch (repeated for multiple operands), data operation, operand address calculation, operand store (repeated for multiple results). When the instruction completes, control returns to fetch the next instruction, or returns for string or vector data.]

Table 3.1 Classes of Interrupts

Program: Generated by some condition that occurs as a result of an instruction execution, such as arithmetic overflow, division by zero, attempt to execute an illegal machine instruction, or reference outside a user's allowed memory space.
Timer: Generated by a timer within the processor. This allows the operating system to perform certain functions on a regular basis.
I/O: Generated by an I/O controller, to signal normal completion of an operation, request service from the processor, or to signal a variety of error conditions.
Hardware failure: Generated by a failure such as power failure or memory parity error.

[Figure 3.7 Program Flow of Control Without and With Interrupts. Panels: (a) No interrupts; (b) Interrupts, short I/O wait; (c) Interrupts, long I/O wait. Each panel shows a user program (code segments 1, 2, and 3 separated by WRITE calls) and an I/O program (segments 4 and 5 around the I/O command). In panels (b) and (c) an interrupt handler runs before segment 5, and interrupts occur during the course of execution of the user program; in panel (b) they split segments 2 and 3 into 2a/2b and 3a/3b.]

[Figure 3.8 Transfer of Control via Interrupts. An interrupt occurs after instruction i of the user program; control passes to the interrupt handler and then returns to instruction i+1.]

[Figure 3.9 Instruction Cycle with Interrupts. START → fetch next instruction (fetch cycle) → execute instruction (execute cycle) → if interrupts are enabled, check for interrupt and process interrupt (interrupt cycle); if interrupts are disabled, go directly to the next fetch. HALT exits from the execute cycle.]
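
The cycle in Figure 3.9 can be sketched as a loop with an interrupt check after each execute stage. This is an illustrative model only: instructions are plain strings, the `raised_during` mapping (a hypothetical name) says which interrupt becomes pending while a given instruction executes, and "handling" an interrupt is reduced to a log entry.

```python
from collections import deque


def run_cycle(program, raised_during, interrupts_enabled=True):
    """Run a list of instruction names until "HALT"; return a log of events.

    raised_during maps an instruction index to the name of an interrupt
    that becomes pending while that instruction executes.
    """
    pending = deque()
    log = []
    pc = 0
    while True:
        instruction = program[pc]            # fetch cycle
        pc += 1
        if instruction == "HALT":            # execute cycle: stop the machine
            log.append("halt")
            return log
        log.append(f"execute {instruction}")
        if pc - 1 in raised_during:          # an interrupt arrives meanwhile
            pending.append(raised_during[pc - 1])
        if interrupts_enabled and pending:   # interrupt cycle
            log.append(f"handle {pending.popleft()}")


program = ["i1", "i2", "i3", "HALT"]
print(run_cycle(program, {1: "I/O complete"}))
# ['execute i1', 'execute i2', 'handle I/O complete', 'execute i3', 'halt']
```

With `interrupts_enabled=False` the pending interrupt is never processed and the program runs straight through, which corresponds to the "interrupts disabled" path in the figure. The sketch expects the program to end with "HALT".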

[Figure 3.10 Program Timing: Short I/O Wait. Time runs downward. (a) Without interrupts: 1, 4, I/O operation (processor waits), 5, 2, 4, I/O operation (processor waits), 5, 3. (b) With interrupts: 1, 4, 2a (I/O operation concurrent with processor executing), 5, 2b, 4, 3a (I/O operation concurrent with processor executing), 5, 3b.]

[Figure 3.11 Program Timing: Long I/O Wait. Time runs downward. (a) Without interrupts: 1, 4, I/O operation (processor waits), 5, 2, 4, I/O operation (processor waits), 5, 3. (b) With interrupts: 1, 4, 2 (I/O operation concurrent with processor executing; then processor waits), 5, 4, 3 (I/O operation concurrent with processor executing; then processor waits), 5.]

[Figure 3.12 Instruction Cycle State Diagram, With Interrupts. States: instruction address calculation, instruction fetch, instruction operation decoding, operand address calculation, operand fetch (repeated for multiple operands), data operation, operand address calculation, operand store (repeated for multiple results), interrupt check, interrupt. When the instruction completes, control returns to fetch the next instruction (directly if there is no interrupt), or returns for string or vector data.]

[Figure 3.13 Transfer of Control with Multiple Interrupts. (a) Sequential interrupt processing: the user program is interrupted by handler X, and handler Y runs after X completes. (b) Nested interrupt processing: handler X is itself interrupted by handler Y.]

[Figure 3.14 Example Time Sequence of Multiple Interrupts. Shows a user program, a printer interrupt service routine, a communication interrupt service routine, and a disk interrupt service routine, with transfers of control labeled t = 0, 10, 15, 25, 25, 35, and 40.]
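
The time labels in Figure 3.14 can be reproduced with a small priority-driven simulation. The figure residue above does not state priorities, arrival times, or service times, so the values used here are assumptions taken to match the labels: printer priority 2 arriving at t = 10, communication priority 5 at t = 15, disk priority 4 at t = 20, and 10 time units of work per service routine. The function name `simulate` is hypothetical.

```python
def simulate(arrivals, service_time=10, end_time=45):
    """Return (start, end, name) segments showing what the processor runs.

    arrivals is a list of (time, name, priority); a larger priority number
    preempts a smaller one, and a preempted routine resumes later.
    """
    remaining = {}   # routine name -> time units of work still to do
    priority = {}    # routine name -> priority level
    timeline = []
    for t in range(end_time):
        for when, name, level in arrivals:
            if when == t:
                remaining[name] = service_time
                priority[name] = level
        active = [name for name, left in remaining.items() if left > 0]
        if active:
            running = max(active, key=priority.get)
            remaining[running] -= 1
        else:
            running = "user program"
        if timeline and timeline[-1][2] == running:
            start, _, _ = timeline[-1]
            timeline[-1] = (start, t + 1, running)   # extend current segment
        else:
            timeline.append((t, t + 1, running))
    return timeline


# Assumed arrival times and priorities (not given in the text above).
arrivals = [(10, "printer", 2), (15, "communication", 5), (20, "disk", 4)]
for start, end, name in simulate(arrivals):
    print(f"t={start:2d}..{end:2d}  {name}")
```

Under these assumptions the printer routine runs from t = 10 to 15, the communication routine from 15 to 25, the disk routine from 25 to 35, and the printer routine resumes from 35 to 40 before the user program continues.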

I/O Function

• I/O module can exchange data directly with the processor
• Processor can read data from or write data to an I/O module
  • Processor identifies a specific device that is controlled by a particular I/O module
  • Uses I/O instructions rather than memory-referencing instructions
• In some cases it is desirable to allow I/O exchanges to occur directly with memory
  • The processor grants to an I/O module the authority to read from or write to memory so that the I/O-memory transfer can occur without tying up the processor
  • The I/O module issues read or write commands to memory, relieving the processor of responsibility for the exchange
  • This operation is known as direct memory access (DMA)

[Figure 3.15 Computer Modules. Memory (N words, addresses 0 to N–1): inputs Read, Write, Address, Data; output Data. I/O module (M ports): inputs Read, Write, Address, Internal data, External data; outputs Internal data, External data, Interrupt signals. CPU: inputs Instructions, Data, Interrupt signals; outputs Address, Control signals, Data.]

The interconnection structure must support the following types of transfers:
• Memory to processor
  • Processor reads an instruction or a unit of data from memory
• Processor to memory
  • Processor writes a unit of data to memory
• I/O to processor
  • Processor reads data from an I/O device via an I/O module
• Processor to I/O
  • Processor sends data to the I/O device
• I/O to or from memory
  • An I/O module is allowed to exchange data directly with memory without going through the processor using direct memory access

Bus Interconnection

• A communication pathway connecting two or more devices
  • Key characteristic is that it is a shared transmission medium
• Signals transmitted by any one device are available for reception by all other devices attached to the bus
  • If two devices transmit during the same time period their signals will overlap and become garbled
• Typically consists of multiple communication lines
  • Each line is capable of transmitting signals representing binary 1 and binary 0
• Computer systems contain a number of different buses that provide pathways between components at various levels of the computer system hierarchy
• System bus
  • A bus that connects major computer components (processor, memory, I/O)
• The most common computer interconnection structures are based on the use of one or more system buses

Data Bus

• Data lines that provide a path for moving data among system modules
• May consist of 32, 64, 128, or more separate lines
• The number of lines is referred to as the width of the data bus
• The number of lines determines how many bits can be transferred at a time
• The width of the data bus is a key factor in determining overall system performance
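
The link between data bus width and performance comes down to simple arithmetic: a unit of data wider than the bus needs more than one transfer. The sketch below shows this; the function name and the 512-bit block size are illustrative assumptions, not values from the slides.

```python
def transfers_needed(payload_bits: int, bus_width_bits: int) -> int:
    """Number of bus transfers needed to move payload_bits over the data bus."""
    if payload_bits < 0 or bus_width_bits <= 0:
        raise ValueError("payload must be >= 0 and bus width must be > 0")
    return -(-payload_bits // bus_width_bits)   # ceiling division


# Moving one 512-bit block (an assumed size) over buses of different widths.
for width in (32, 64, 128):
    print(f"{width:3d}-bit bus: {transfers_needed(512, width)} transfers")
```

For example, a 64-bit instruction on a 32-bit data bus requires two transfers, while a 64-bit bus moves it in one.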

Address Bus

• Used to designate the source or destination of the data on the data bus
  • If the processor wishes to read a word of data from memory it puts the address of the desired word on the address lines
• Width determines the maximum possible memory capacity of the system
• Also used to address I/O ports
  • The higher order bits are used to select a particular module on the bus and the lower order bits select a memory location or I/O port within the module

Control Bus

• Used to control the access and the use of the data and address lines
• Because the data and address lines are shared by all components there must be a means of controlling their use
• Control signals transmit both command and timing information among system modules
• Timing signals indicate the validity of data and address information
• Command signals specify operations to be performed
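
Two of the address bus points above lend themselves to a short sketch: the width fixes how many locations can be addressed, and the higher-order bits can select a module while the lower-order bits select a location or port inside it. The 8-bit bus with a single module-select bit is an assumed example, and the function names are hypothetical.

```python
def max_addressable_locations(address_width: int) -> int:
    """An n-bit address bus can name 2**n distinct locations."""
    if address_width <= 0:
        raise ValueError("address width must be positive")
    return 2 ** address_width


def split_address(address: int, address_width: int, module_bits: int) -> tuple[int, int]:
    """Split an address into (module number, offset within the module)."""
    if not 0 < module_bits < address_width:
        raise ValueError("module_bits must be between 1 and address_width - 1")
    if not 0 <= address < 2 ** address_width:
        raise ValueError("address does not fit on the bus")
    offset_bits = address_width - module_bits
    module = address >> offset_bits                 # higher-order bits
    offset = address & ((1 << offset_bits) - 1)     # lower-order bits
    return module, offset


print(max_addressable_locations(16))        # 65536
print(split_address(0b01111111, 8, 1))      # (0, 127): last location in module 0
print(split_address(0b10000011, 8, 1))      # (1, 3): port 3 in module 1
```

In this assumed layout, addresses 00000000 to 01111111 fall in module 0 (for instance a 128-word memory) and addresses 10000000 and above fall in module 1 (for instance an I/O module).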

[Figure 3.16 Bus Interconnection Scheme. The CPU, memory modules, and I/O modules all attach to a shared bus made up of control lines, address lines, and data lines.]

Point-to-Point Interconnect

• Principal reason for change was the electrical constraints encountered with increasing the frequency of wide synchronous buses
• At higher and higher data rates it becomes increasingly difficult to perform the synchronization and arbitration functions in a timely fashion
• A conventional shared bus on the same chip magnified the difficulties of increasing bus data rate and reducing bus latency to keep up with the processors
• Compared to the shared bus, the point-to-point interconnect has lower latency, higher data rate, and better scalability

QuickPath Interconnect (QPI)

• Introduced in 2008
• Multiple direct connections
  • Direct pairwise connections to other components eliminating the need for arbitration found in shared transmission systems
• Layered protocol architecture
  • These processor level interconnects use a layered protocol architecture rather than the simple use of control signals found in shared bus arrangements
• Packetized data transfer
  • Data are sent as a sequence of packets each of which includes control headers and error control codes

[Figure 3.17 Multicore Configuration Using QPI. Four cores (A, B, C, D), each with attached DRAM, and two I/O hubs, each serving I/O devices. Link types in the legend: QPI, PCI Express, memory bus.]

[Figure 3.18 QPI Layers. Peer layers and their units of transfer: Protocol (packets), Routing, Link (flits), Physical (phits).]

[Figure 3.19 Physical Interface of the Intel QPI Interconnect. Component A and Component B each have an Intel QuickPath Interconnect port. In each direction, the transmission lanes and forwarded clock (Fwd Clk) of one port connect to the reception lanes and received clock (Rcv Clk) of the other.]

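[Figure 3.20 QPI Multilane Distribution. The bit stream of flits (#1, #2, ..., #n, #n+1, #n+2, ..., #2n, #2n+1, ...) is distributed round-robin across the QPI lanes: lane 0 carries #1, #n+1, #2n+1; lane 1 carries #2, #n+2, #2n+2; lane 19 carries #n, #2n, #3n.]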
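
The round-robin distribution in Figure 3.20 can be sketched in a few lines. The sketch assumes 20 lanes (lanes 0 to 19, so n = 20 in the figure's numbering) and represents the bit stream as a list of bit positions rather than actual bit values; both function names are hypothetical.

```python
def distribute(stream: list, lanes: int = 20) -> list[list]:
    """Send item k of the stream to lane k mod lanes (round-robin)."""
    return [stream[lane::lanes] for lane in range(lanes)]


def merge(lane_streams: list[list]) -> list:
    """Rebuild the original stream from the per-lane streams."""
    lanes = len(lane_streams)
    total = sum(len(s) for s in lane_streams)
    return [lane_streams[i % lanes][i // lanes] for i in range(total)]


bits = list(range(1, 61))            # bit positions #1 .. #60
per_lane = distribute(bits)
print(per_lane[0])                   # [1, 21, 41]  -> #1, #n+1, #2n+1
print(per_lane[19])                  # [20, 40, 60] -> #n, #2n, #3n
assert merge(per_lane) == bits       # the receiver recovers the stream
```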

QPI Link Layer

• Performs two key functions: flow control and error control
  • Operate on the level of the flit (flow control unit)
  • Each flit consists of a 72-bit message payload and an 8-bit error control code called a cyclic redundancy check (CRC)
• Flow control function
  • Needed to ensure that a sending QPI entity does not overwhelm a receiving QPI entity by sending data faster than the receiver can process the data and clear buffers for more incoming data
• Error control function
  • Detects and recovers from bit errors, and so isolates higher layers from experiencing bit errors
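
The flit layout (72-bit payload plus 8-bit CRC) can be illustrated with a generic CRC-8. The generator polynomial used by QPI is not given in these slides, so the sketch assumes the common polynomial x^8 + x^2 + x + 1 (0x07) purely for illustration. It covers only error detection; recovery (for example by retransmission) is not modeled, and the function names are hypothetical.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8 with zero initial value (polynomial is an assumption)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def make_flit(payload: bytes) -> bytes:
    """Build an 80-bit flit: 72-bit (9-byte) payload followed by its CRC."""
    if len(payload) != 9:
        raise ValueError("payload must be exactly 72 bits (9 bytes)")
    return payload + bytes([crc8(payload)])


def flit_ok(flit: bytes) -> bool:
    """Check that the trailing CRC byte matches the payload."""
    return len(flit) == 10 and crc8(flit[:9]) == flit[9]


flit = make_flit(bytes(range(9)))
corrupted = bytes([flit[0] ^ 0x01]) + flit[1:]   # flip one payload bit
print(flit_ok(flit), flit_ok(corrupted))         # True False
```

Because the receiver can reject a damaged flit at the link layer, the layers above it never see the bit error.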

QPI Routing and Protocol Layers

Routing Layer

• Used to determine the course that a packet will traverse across the available system interconnects
• Routing tables are defined by firmware and describe the possible paths that a packet can follow

Protocol Layer

• Packet is defined as the unit of transfer
• One key function performed at this level is a cache coherency protocol which deals with making sure that main memory values held in multiple caches are consistent
• A typical data packet payload is a block of data being sent to or from a cache

Peripheral Component Interconnect (PCI)

• A popular high bandwidth, processor independent bus that can function as a mezzanine or peripheral bus
• Delivers better system performance for high speed I/O subsystems
• PCI Special Interest Group (SIG)
  • Created to develop further and maintain the compatibility of the PCI specifications
• PCI Express (PCIe)
  • Point-to-point interconnect scheme intended to replace bus-based schemes such as PCI
  • Key requirement is high capacity to support the needs of higher data rate I/O devices, such as Gigabit Ethernet
  • Another requirement deals with the need to support time dependent data streams

[Figure 3.21 Typical Configuration Using PCIe. Cores and memory attach to a chipset. The chipset connects over PCIe links to Gigabit Ethernet, a PCIe–PCI bridge, and a switch; the switch fans out over PCIe links to PCIe endpoints and a legacy endpoint.]

[Figure 3.22 PCIe Protocol Layers. Peer layers: Transaction (exchanging transaction layer packets, TLP), Data Link (exchanging data link layer packets, DLLP), Physical.]

[Figure 3.23 PCIe Multilane Distribution. The byte stream B0 to B7 is distributed round-robin across four PCIe lanes, each with its own 128b/130b encoding: lane 0 carries B0, B4; lane 1 carries B1, B5; lane 2 carries B2, B6; lane 3 carries B3, B7.]
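
The byte distribution in Figure 3.23 can be sketched as follows, together with the overhead implied by the 128b/130b label (130 bits on the wire for every 128 bits of data). The four-lane width matches the figure; the function name and the constant are hypothetical names used only here.

```python
def stripe_bytes(data: bytes, lanes: int = 4) -> list[bytes]:
    """Send byte k of the stream to lane k mod lanes (round-robin)."""
    if lanes <= 0:
        raise ValueError("need at least one lane")
    return [data[lane::lanes] for lane in range(lanes)]


# Fraction of transmitted bits that carry data under 128b/130b encoding.
ENCODING_EFFICIENCY = 128 / 130

stream = bytes(range(8))                      # B0 .. B7
for lane, chunk in enumerate(stripe_bytes(stream)):
    print(f"lane {lane}: {list(chunk)}")
print(f"efficiency: {ENCODING_EFFICIENCY:.4f}")
```

Lane 0 receives bytes 0 and 4, lane 1 bytes 1 and 5, and so on, which is the pattern shown in the figure.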

[Figure 3.24 PCIe Transmit and Receive Block Diagrams. (a) Transmitter: 8b → scrambler → 8b → 128b/130b encoding → 130b → parallel to serial → 1b → transmitter differential driver → D+, D–. (b) Receiver: D+, D– → differential receiver → 1b → clock recovery circuit and data recovery circuit → 1b → serial to parallel → 130b → 128b/130b decoding → 128b → descrambler → 8b.]

PCIe Transaction Layer (TL)

• Receives read and write requests from the software above the TL and creates request packets for transmission to a destination via the link layer
• Most transactions use a split transaction technique
  • A request packet is sent out by a source PCIe device which then waits for a response called a completion packet
• TL messages and some write transactions are posted transactions (meaning that no response is expected)
• TL packet format supports 32-bit memory addressing and extended 64-bit memory addressing

The TL supports four address spaces:

• Memory
  • The memory space includes system main memory and PCIe I/O devices
  • Certain ranges of memory addresses map into I/O devices
• I/O
  • This address space is used for legacy PCI devices, with reserved address ranges used to address legacy I/O devices
• Configuration
  • This address space enables the TL to read/write configuration registers associated with I/O devices
• Message
  • This address space is for control signals related to interrupts, error handling, and power management

Table 3.2 PCIe TLP Transaction Types

Address Space | TLP Type | Purpose
Memory | Memory Read Request; Memory Read Lock Request; Memory Write Request | Transfer data to or from a location in the system memory map.
I/O | I/O Read Request; I/O Write Request | Transfer data to or from a location in the system memory map for legacy devices.
Configuration | Config Type 0 Read Request; Config Type 0 Write Request; Config Type 1 Read Request; Config Type 1 Write Request | Transfer data to or from a location in the configuration space of a PCIe device.
Message | Message Request; Message Request with Data | Provides in-band messaging and event reporting.
Memory, I/O, Configuration | Completion; Completion with Data; Completion Locked; Completion Locked with Data | Returned for certain requests.
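
(a) Transaction Layer Packet
    Number of octets   Field             Source
    1                  STP framing       Appended by Physical Layer
    2                  Sequence number   Appended by Data Link Layer
    12 or 16           Header            Created by Transaction Layer
    0 to 4096          Data              Created by Transaction Layer
    0 or 4             ECRC              Created by Transaction Layer
    4                  LCRC              Appended by Data Link Layer
    1                  STP framing       Appended by Physical Layer
(b) Data Link Layer Packet
    Number of octets   Field             Source
    1                  Start             Appended by PL
    4                  DLLP              Created by DLL
    2                  CRC               Created by DLL
    1                  End               Appended by PL
Figure 3.25 PCIe Protocol Data Unit Format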
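
The field sizes in Figure 3.25(a) make it easy to work out how many octets each layer adds to a transaction layer packet. The sketch below does that arithmetic; the function name and the dictionary keys are hypothetical, and the 64-octet payload is just an example value.

```python
def tlp_octets(data_len: int, header_len: int = 12, ecrc: bool = False) -> dict[str, int]:
    """Size of a TLP in octets as it leaves each layer (per Figure 3.25a)."""
    if header_len not in (12, 16):
        raise ValueError("header is 12 or 16 octets")
    if not 0 <= data_len <= 4096:
        raise ValueError("data field is 0 to 4096 octets")
    transaction = header_len + data_len + (4 if ecrc else 0)   # header, data, ECRC
    data_link = 2 + transaction + 4      # sequence number in front, LCRC behind
    physical = 1 + data_link + 1         # one framing octet on each side
    return {"transaction": transaction, "data_link": data_link, "physical": physical}


sizes = tlp_octets(data_len=64)
print(sizes)   # {'transaction': 76, 'data_link': 82, 'physical': 84}
print(f"overhead: {sizes['physical'] - 64} octets")
```

With a 12-octet header and no ECRC, every packet carries 20 octets of overhead regardless of payload size, so small payloads use the link less efficiently than large ones.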

Summary

Chapter 3: A Top-Level View of Computer Function and Interconnection

• Computer components
• Computer function
  • Instruction fetch and execute
  • Interrupts
  • I/O function
• Interconnection structures
• Bus interconnection
• Point-to-point interconnect
  • QPI physical layer
  • QPI link layer
  • QPI routing layer
  • QPI protocol layer
• PCI express
  • PCI physical and logical architecture
  • PCIe physical layer
  • PCIe transaction layer
  • PCIe data link layer
