
[Figure 1.1 A Top-Down View of a Computer: the computer consists of main memory, I/O, and the CPU connected by the system bus; within the CPU, registers, the ALU, and the control unit (sequencing logic, control unit registers and decoders, control memory) are connected by an internal bus.]

[Figure 3.15 Computer Modules: the memory module (N words, read and write signals, address and data lines); the I/O module (M ports, read and write signals, address, internal data, external data, and interrupt signals); and the CPU (instructions, data, address, control signals, and interrupt signals).]

The interconnection structure must support the following types of transfers (a small sketch of these transfers follows the list):
- Memory to processor: the processor reads an instruction or a unit of data from memory
- Processor to memory: the processor writes a unit of data to memory
- I/O to processor: the processor reads data from an I/O device via an I/O module
- Processor to I/O: the processor sends data to the I/O device
- I/O to or from memory: an I/O module is allowed to exchange data directly with memory, without going through the processor, using direct memory access (DMA)
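To make the five transfer types above concrete, here is a minimal sketch that models them as operations on a single shared bus object. The class and method names are hypothetical illustrations, not part of any real bus interface.

```python
# Minimal sketch (illustrative only) of the five transfer types.

class SystemBus:
    def __init__(self):
        self.memory = {}        # address -> data word
        self.io_devices = {}    # port -> device object with read()/write()

    def mem_read(self, address):            # memory to processor
        return self.memory.get(address, 0)

    def mem_write(self, address, data):     # processor to memory
        self.memory[address] = data

    def io_read(self, port):                # I/O to processor (via I/O module)
        return self.io_devices[port].read()

    def io_write(self, port, data):         # processor to I/O
        self.io_devices[port].write(data)

    def dma_transfer(self, port, address, count):   # I/O to/from memory (DMA)
        # The block moves between device and memory without the processor.
        for offset in range(count):
            self.memory[address + offset] = self.io_devices[port].read()
```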

Buses and Their Appeal


[Figure: the three sets of lines found in a bus: control (handshaking, direction, transfer mode, arbitration, ...), address, and data (one bit (serial) to several bytes; may be shared).]


A typical computer may use a dozen or so different buses:
1. Legacy buses: PC bus, ISA, RS-232, parallel port
2. Standard buses: PCI, SCSI, USB, Ethernet
3. Proprietary buses: for specific devices and maximum performance

What is a bus?
A bus is:
- A shared communication link
- A single set of wires used to connect multiple subsystems

[Figure: the classic components of a computer (processor with control and datapath, memory, input, output) connected by a bus]

A bus is also a fundamental tool for composing large, complex systems:
- A systematic means of abstraction


Advantages of Buses

[Figure: a processor, memory, and several I/O devices attached to a single shared bus]

- Versatility:
  - New devices can be added easily
  - Peripherals can be moved between computer systems that use the same bus standard
- Low cost:
  - A single set of wires is shared in multiple ways

Disadvantage of Buses

[Figure: a processor, memory, and several I/O devices contending for a single shared bus]

- It creates a communication bottleneck:
  - The bandwidth of that bus can limit the maximum I/O throughput (a rough bandwidth calculation follows)
- The maximum bus speed is largely limited by:
  - The length of the bus
  - The number of devices on the bus
  - The need to support a range of devices with:
    - Widely varying latencies
    - Widely varying data transfer rates
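As a rough illustration of why the shared bus becomes the bottleneck: peak bus bandwidth is approximately bus width times clock rate, and every attached device shares it. The numbers below are hypothetical, chosen only to show the arithmetic.

```python
# Back-of-the-envelope calculation with hypothetical numbers.
bus_width_bytes = 8           # a 64-bit wide data bus
bus_clock_hz = 100e6          # 100 MHz bus clock
peak_bandwidth = bus_width_bytes * bus_clock_hz   # bytes per second

print(peak_bandwidth / 1e6, "MB/s peak")            # 800.0 MB/s peak
# If four devices stream continuously, each sees at most a quarter of that.
print(peak_bandwidth / 4 / 1e6, "MB/s per device")  # 200.0 MB/s
```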

General Organization of a Bus

[Figure: control lines and data lines shared by the devices on the bus]

- Control lines:
  - Signal requests and acknowledgments
  - Indicate what type of information is on the data lines
- Data lines carry information between the source and the destination:
  - Data and addresses
  - Complex commands

Master versus Slave

[Figure: the bus master issues the command; data can go either way between bus master and bus slave]

- A bus transaction includes two parts (sketched below):
  - Issuing the command (and address): the request
  - Transferring the data: the action
- The master is the one who starts the bus transaction by:
  - Issuing the command (and address)
- The slave is the one who responds to the address by:
  - Sending data to the master if the master asks for data
  - Receiving data from the master if the master wants to send data
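A minimal sketch of the two-part transaction described above: the master issues the command and address (the request), and the data then moves in either direction (the action). The classes and method names are hypothetical, used only to illustrate the master/slave roles.

```python
# Hypothetical illustration of a bus transaction.

class BusSlave:
    def __init__(self):
        self.storage = {}

    def respond(self, command, address, data=None):
        # The slave is activated by the transaction: it returns data on a
        # read and accepts data on a write.
        if command == "READ":
            return self.storage.get(address, 0)
        if command == "WRITE":
            self.storage[address] = data

class BusMaster:
    def __init__(self, slave):
        self.slave = slave

    def transaction(self, command, address, data=None):
        # Part 1 (request): issue the command and the address.
        # Part 2 (action): transfer the data in either direction.
        return self.slave.respond(command, address, data)

slave = BusSlave()
master = BusMaster(slave)          # only the master initiates transactions
master.transaction("WRITE", 0x10, 42)
assert master.transaction("READ", 0x10) == 42
```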

Types of Busses
- Processor-memory bus (design specific)
  - Short and high speed
  - Only needs to match the memory system
    - Maximize memory-to-processor bandwidth
  - Connects directly to the processor
    - Optimized for cache block transfers
- I/O bus (industry standard)
  - Usually lengthy and slower
  - Needs to match a wide range of I/O devices
  - Connects to the processor-memory bus or backplane bus
- Backplane bus (standard or proprietary)
  - Backplane: an interconnection structure within the chassis
  - Allows processors, memory, and I/O devices to coexist
  - Cost advantage: one bus for all components

Example: Pentium System Organization

[Figure: a Pentium system built around the processor/memory bus, the PCI bus, and I/O busses]

A Computer System with One Bus: Backplane Bus

[Figure: processor, memory, and I/O devices all attached to a single backplane bus]

- A single bus (the backplane bus) is used for:
  - Processor-to-memory communication
  - Communication between I/O devices and memory
- Advantages: simple and low cost
- Disadvantages: slow, and the bus can become a major bottleneck
- Example: IBM PC-AT

A Two-Bus System

[Figure: the processor and memory sit on the processor-memory bus; bus adaptors connect several I/O buses to it]

- I/O buses tap into the processor-memory bus via bus adaptors:
  - Processor-memory bus: mainly for processor-memory traffic
  - I/O buses: provide expansion slots for I/O devices
- Example: Apple Macintosh II
  - NuBus: processor, memory, and a few selected I/O devices
  - SCSI bus: the rest of the I/O devices

A Three-Bus System

[Figure: a bus adaptor connects the processor-memory bus to a backplane bus; further bus adaptors connect I/O buses to the backplane bus]

- A small number of backplane buses tap into the processor-memory bus
  - The processor-memory bus is used only for processor-memory traffic
  - I/O buses are connected to the backplane bus
- Advantage: loading on the processor bus is greatly reduced

North/South Bridge Architectures: Separate Busses

[Figure: the processor (with a backside cache) on the processor-memory bus; bus adaptors connect memory, the backplane bus, and the I/O buses]

- Separate sets of pins for different functions:
  - Memory bus
  - Caches
  - Graphics bus (for a fast frame buffer)
  - I/O busses are connected to the backplane bus
- Advantages:
  - Busses can run at different speeds
  - Much less overall loading!

What defines a bus?
- Transaction protocol
- Timing and signaling specification
- Bunch of wires
- Electrical specification
- Physical / mechanical characteristics
  - The connectors

Synchronous and Asynchronous Bus

- Synchronous bus:
  - Includes a clock in the control lines
  - A fixed protocol for communication that is relative to the clock
  - Advantage: involves very little logic and can run very fast
  - Disadvantages:
    - Every device on the bus must run at the same clock rate
    - To avoid clock skew, synchronous buses cannot be long if they are fast
- Asynchronous bus:
  - Is not clocked
  - Can accommodate a wide range of devices
  - Can be lengthened without worrying about clock skew
  - Requires a handshaking protocol (see the sketch below)
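A minimal sketch of the kind of request/acknowledge handshake an asynchronous bus relies on, written as a four-phase handshake. The signal and function names are hypothetical, not taken from any particular bus standard.

```python
# Hypothetical four-phase req/ack handshake for an asynchronous read.

class AsyncSlave:
    def __init__(self):
        self.data = {0x10: 42}
        self.ack = False

    def request_asserted(self, address):
        value = self.data.get(address, 0)   # put the data on the data lines
        self.ack = True                     # assert ack: data is valid
        return value

    def request_dropped(self):
        self.ack = False                    # deassert ack: handshake complete

def async_read(slave, address):
    # Phase 1: master asserts req with the address on the address lines.
    value = slave.request_asserted(address)
    # Phase 2: slave asserts ack; the master latches the data.
    assert slave.ack
    # Phase 3: master deasserts req.
    slave.request_dropped()
    # Phase 4: slave deasserts ack; the bus is free for the next transfer.
    assert not slave.ack
    return value

print(async_read(AsyncSlave(), 0x10))   # 42
```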

Busses so far

[Figure: the master drives the control, address, and data lines to the slave]

- Bus master: has the ability to control the bus; initiates the transaction
- Bus slave: the module activated by the transaction
- Bus communication protocol: the specification of the sequence of events and timing requirements in transferring information
  - Asynchronous bus transfers: control lines (req, ack) serve to orchestrate sequencing
  - Synchronous bus transfers: sequence relative to a common clock

Arbitration: Obtaining Access to the Bus

[Figure: the master initiates requests over the control lines; data can go either way between bus master and bus slave]

- One of the most important issues in bus design:
  - How is the bus reserved by a device that wishes to use it?
- Chaos is avoided by a master-slave arrangement:
  - Only the bus master can control access to the bus:
    - It initiates and controls all bus requests
  - A slave responds to read and write requests
- The simplest system:
  - The processor is the only bus master
  - All bus requests must be controlled by the processor
  - Major drawback: the processor is involved in every transaction

Multiple Potential Bus Masters: the Need for Arbitration

- Bus arbitration scheme:
  - A bus master wanting to use the bus asserts the bus request
  - A bus master cannot use the bus until its request is granted
  - A bus master must signal to the arbiter the end of the bus utilization
- Bus arbitration schemes usually try to balance two factors:
  - Bus priority: the highest-priority device should be serviced first
  - Fairness: even the lowest-priority device should never be completely locked out from the bus
- Bus arbitration schemes can be divided into four broad classes:
  - Daisy chain arbitration
  - Centralized, parallel arbitration
  - Distributed arbitration by self-selection: each device wanting the bus places a code indicating its identity on the bus
  - Distributed arbitration by collision detection: each device just goes for it; problems are found after the fact

The Daisy Chain Bus Arbitration Scheme

[Figure: the bus arbiter passes the grant signal from Device 1 (highest priority) through Device 2 down to Device N (lowest priority); the request and release lines are wired-OR]

- Advantage: simple
- Disadvantages:
  - Cannot assure fairness: a low-priority device may be locked out indefinitely (illustrated in the sketch below)
  - The use of the daisy chain grant signal also limits the bus speed
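A minimal sketch of daisy-chain grant propagation (the function and its arguments are hypothetical). It also shows the fairness problem: a low-priority device starves whenever a higher-priority device keeps requesting.

```python
# Hypothetical daisy-chain arbitration: the grant enters at the highest-
# priority device and is passed down only if that device is not requesting.

def daisy_chain_arbitrate(requests):
    """requests: list of booleans, index 0 = highest priority.
    Returns the index of the device that keeps the grant, or None."""
    for device, wants_bus in enumerate(requests):
        if wants_bus:
            return device   # this device keeps the grant
        # otherwise the grant ripples to the next device in the chain
    return None

# Device 0 keeps requesting, so device 2 never gets the bus (no fairness).
print(daisy_chain_arbitrate([True, False, True]))   # 0
print(daisy_chain_arbitrate([False, False, True]))  # 2
```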

Centralized Parallel Arbitration

[Figure: Device 1 through Device N each have their own request (Req) and grant lines to a central bus arbiter]

- Used in essentially all processor-memory busses and in high-speed I/O busses (a small arbiter sketch follows)
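For contrast, here is a minimal sketch of a centralized parallel arbiter. All request lines are visible to the arbiter at once, unlike the daisy chain; rotating the starting priority after each grant is one simple way to provide the fairness mentioned earlier (the rotation is an illustrative choice, not something the slides prescribe).

```python
# Hypothetical centralized parallel arbiter with rotating priority.

class RotatingArbiter:
    def __init__(self, num_devices):
        self.num_devices = num_devices
        self.next_start = 0      # device checked first on the next cycle

    def grant(self, requests):
        """requests: list of booleans, one per device.
        Returns the granted device index, or None if nobody is requesting."""
        for offset in range(self.num_devices):
            device = (self.next_start + offset) % self.num_devices
            if requests[device]:
                self.next_start = (device + 1) % self.num_devices
                return device
        return None

arb = RotatingArbiter(3)
print(arb.grant([True, False, True]))   # 0
print(arb.grant([True, False, True]))   # 2 (device 0 can no longer starve it)
```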

A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working. SPOFs are undesirable in any system with a goal of high availability or reliability, be it a business practice, software application, or other industrial system.

PCI = Peripheral Component Interconnect
AGP = Accelerated Graphics Port (high-speed, point-to-point)

[Figure: comparison of performance per pin for various busses]

Simple Organization for Input/Output

[Figure: input/output via a single common bus. The CPU (with its cache and interrupt lines) and main memory share the system bus with I/O controllers for two disks, a graphics display, and a network.]

I/O Organization for Greater Performance

[Figure: input/output via intermediate and dedicated I/O buses. The CPU (with its cache and interrupt lines) and main memory sit on the memory bus; AGP and a bus adapter to the PCI bus provide intermediate buses/ports; further bus adapters lead to I/O buses whose controllers serve a graphics display, a network, disks, and a CD/DVD drive.]

PCI Bus Based Platform

The shared bus design (even in the case of multiple busses, each of which has a data, address, and control bus) was the prevalent CPU/component interconnect architecture for decades. Regardless of which arbitration or timing scheme, or type of bus, is in use, any shared bus architecture has limitations whenever the capacity of the bus is reached. The solution: physically connect computer devices that you know are often exchanging information/data.

Point-to-Point Interconnect

- The principal reason for the change was the electrical constraints encountered when increasing the frequency of wide synchronous buses
- At higher and higher data rates it becomes increasingly difficult to perform the synchronization and arbitration functions in a timely fashion
- A conventional shared bus on the same chip magnified the difficulties of increasing the bus data rate and reducing the bus latency to keep up with the processors
- Point-to-point interconnect has lower latency, a higher data rate, and better scalability

Latency is the time it takes for the data requested by the CPU to start arriving. Bandwidth is the rate at which the data arrives (a worked example follows).

In the example configuration, each point is connected directly to each of the other four points.
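A quick worked example of the latency/bandwidth distinction defined above, using transfer time = latency + size / bandwidth. All numbers are hypothetical, chosen only to illustrate the formula.

```python
# Illustrative only: hypothetical latency and bandwidth figures.
latency_s = 50e-9             # 50 ns until the first data starts arriving
bandwidth_bytes_per_s = 16e9  # 16 GB/s once the transfer is streaming
block_bytes = 64              # one cache block

transfer_time = latency_s + block_bytes / bandwidth_bytes_per_s
print(transfer_time * 1e9, "ns")   # 54.0 ns: latency dominates small transfers
```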

Point-to-Point Interconnect: Significant Characteristics

- Multiple direct connections: direct pairwise connections to other components, eliminating arbitration (as in a shared system)
- Layered protocol architecture: similar to layered data networks, rather than the use of control signals
- Packetized data transfer: a sequence of packets, each of which includes control headers and error controls

Peripheral Component Interconnect (PCI) Express

- PCI Express (PCIe): a point-to-point interconnect scheme intended to replace bus-based schemes such as PCI
- A key requirement is high capacity to support the needs of higher-data-rate I/O devices, such as Gigabit Ethernet
- Another requirement deals with the need to support time-dependent data streams

[Figure 3.21 Typical Configuration Using PCIe: the cores and memory attach to the chipset (host bridge); PCIe links from the chipset connect a Gigabit Ethernet device, a PCIe-PCI bridge, a legacy endpoint, and a switch that fans out to several PCIe endpoints.]


All information moves across an active PCI Express link in fundamental chunks called packets. The two major classes of packets exchanged between two PCIe devices are high-level Transaction Layer Packets (TLPs) and low-level link maintenance packets called Data Link Layer Packets (DLLPs). Collectively, the various TLPs and DLLPs allow two devices to perform memory, I/O, and Configuration Space transactions reliably, and to use messages to initiate power management events, generate interrupts, report errors, etc.

Each PCIe packet has a known size and format. The packet header, positioned at the beginning of each DLLP and TLP, indicates the packet type and the presence of any optional fields. The size of each packet field is either fixed or defined by the packet (transaction) type, i.e. memory, I/O, configuration, or message. The size of any data payload is conveyed in the TLP header Length field (a simplified packing sketch follows).
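To illustrate the idea of a header that announces the packet type and the size of its payload, here is a deliberately simplified packing/parsing sketch. The field layout (1-byte type, 1-byte flags, 2-byte length) is hypothetical and does not reproduce the actual PCIe TLP header format.

```python
import struct

# Hypothetical, simplified "TLP-like" header; real PCIe headers differ.
HEADER_FORMAT = ">BBH"   # type (1 byte), flags (1 byte), payload length (2 bytes)

def build_packet(pkt_type, flags, payload):
    header = struct.pack(HEADER_FORMAT, pkt_type, flags, len(payload))
    return header + payload

def parse_packet(raw):
    header_size = struct.calcsize(HEADER_FORMAT)
    pkt_type, flags, length = struct.unpack(HEADER_FORMAT, raw[:header_size])
    return pkt_type, flags, raw[header_size:header_size + length]

MEMORY_WRITE = 0x01      # hypothetical type code
pkt = build_packet(MEMORY_WRITE, 0, b"\xde\xad\xbe\xef")
print(parse_packet(pkt))  # (1, 0, b'\xde\xad\xbe\xef')
```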

[Figure 3.22 PCIe Protocol Layers: each device has a transaction layer, a data link layer, and a physical layer; transaction layer packets (TLPs) are exchanged between transaction layers, and data link layer packets (DLLPs) between data link layers.]

The TL supports four address spaces:
- Memory: the memory space includes system main memory and PCIe I/O devices; certain ranges of memory addresses map into I/O devices
- I/O: this address space is used for legacy PCI devices, with reserved address ranges used to address legacy I/O devices
- Configuration: this address space enables the TL to read/write configuration registers associated with I/O devices
- Message: this address space is for control signals related to interrupts, error handling, and power management

Table 3.2 PCIe TLP Transaction Types

Address space: Memory
  TLP types: Memory Read Request; Memory Read Lock Request; Memory Write Request
  Purpose: Transfer data to or from a location in the system memory map.

Address space: I/O
  TLP types: I/O Read Request; I/O Write Request
  Purpose: Transfer data to or from a location in the system memory map for legacy devices.

Address space: Configuration
  TLP types: Config Type 0 Read Request; Config Type 0 Write Request; Config Type 1 Read Request; Config Type 1 Write Request
  Purpose: Transfer data to or from a location in the configuration space of a PCIe device.

Address space: Message
  TLP types: Message Request; Message Request with Data
  Purpose: Provides in-band messaging and event reporting.

Address space: Memory, I/O, Configuration
  TLP types: Completion; Completion with Data; Completion Locked; Completion Locked with Data
  Purpose: Returned for certain requests.

[Figure: PCIe layered protocol and TLP assembly/disassembly]

QuickPath Interconnect (QPI)

- Introduced in 2008
- Multiple direct connections
  - Direct pairwise connections to other components, eliminating the need for arbitration found in shared transmission systems
- Layered protocol architecture
  - These processor-level interconnects use a layered protocol architecture rather than the simple use of control signals found in shared bus arrangements
- Packetized data transfer
  - Data are sent as a sequence of packets, each of which includes control headers and error control codes

QPI
- Intel shared front-side bus, up until 2004
  - For the Intel Xeon 64-bit processor and the Intel Itanium 128-bit processor
- Intel dual independent busses, circa 2005
- Intel dedicated high-speed interconnects, 2007

[Figure 3.17 Multicore Configuration Using QPI: cores A, B, C, and D each have DRAM attached over a memory bus; the cores are interconnected by QPI links and also connect via QPI to I/O hubs, which reach I/O devices over PCI Express.]

[Figure 3.18 QPI Layers: each side of the link has a protocol layer (exchanging packets), a routing layer, a link layer (exchanging flits), and a physical layer (exchanging phits).]

[Figure 3.19 Physical Interface of the Intel QPI Interconnect: the QuickPath Interconnect ports of components A and B are joined by transmission lanes and reception lanes in each direction, each with a forwarded clock (Fwd Clk) and a received clock (Rcv Clk).]

QPI:
- Contains multiple direct pairwise connections between components, eliminating the need for arbitration found in shared bus systems
- Has a layered protocol architecture, similar in design to the classical network protocols that govern today's networks (the Internet, for example)
- Relies on the concept of packets, which are bundles of information with control, error, data payloads, etc.

QPI Link Layer

- Performs two key functions: flow control and error control
- Both operate on the level of the flit (flow control unit)
  - Each flit consists of a 72-bit message payload and an 8-bit error control code called a cyclic redundancy check (CRC)
- Flow control function: needed to ensure that a sending QPI entity does not overwhelm a receiving QPI entity by sending data faster than the receiver can process the data and clear buffers for more incoming data
- Error control function: detects and recovers from bit errors, and so isolates higher layers from experiencing bit errors (a CRC sketch follows)
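A minimal sketch of how an 8-bit CRC over a 72-bit flit payload could be computed at the sender and checked at the receiver. The polynomial and the framing below are assumptions made for illustration; they are not the CRC actually specified for QPI.

```python
# Illustrative CRC-8 over a 72-bit (9-byte) payload. The polynomial
# (x^8 + x^2 + x + 1, i.e. 0x07) is an assumed example, not QPI's.

def crc8(data: bytes, poly: int = 0x07) -> int:
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

payload = b"\x11\x22\x33\x44\x55\x66\x77\x88\x99"   # 72-bit message payload
flit = payload + bytes([crc8(payload)])              # 80-bit flit: payload + CRC

# Receiver side: recompute and compare; on a mismatch the link layer would
# recover (e.g. by retransmission) so higher layers never see the bit error.
received_payload, received_crc = flit[:9], flit[9]
assert crc8(received_payload) == received_crc
```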

QPI Routing and Protocol Layers

- Routing layer:
  - Used to determine the course that a packet will traverse across the available system interconnects
  - Routing tables are defined by firmware and describe the possible paths that a packet can follow
- Protocol layer:
  - The packet is defined as the unit of transfer
  - One key function performed at this level is a cache coherency protocol, which deals with making sure that main memory values held in multiple caches are consistent
  - A typical data packet payload is a block of data being sent to or from a cache

Protocol layer: the high-level set of rules for exchanging packets of data between devices.
Routing layer: provides the framework for directing packets from one location of the network to another; relies on routing algorithms to determine the fastest path, the best and least-congested route, etc.
Link layer: responsible for reliable transmission and flow control. The link layer's unit of transfer is an 80-bit flit (flow control unit).
Physical layer: the actual wires carrying the signals, as well as the circuitry and logic to support transmission and receipt of 1s and 0s. The unit of transfer at the physical layer is 20 bits, which is called a phit (physical unit); a small flit-to-phit sketch follows.
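A small sketch of the flit/phit relationship described above: an 80-bit flit is carried by the physical layer as four 20-bit phits. The splitting code is illustrative only; how phits are mapped onto physical lanes is not covered here.

```python
# Illustrative only: split an 80-bit flit into 20-bit phits and reassemble.

FLIT_BITS, PHIT_BITS = 80, 20

def flit_to_phits(flit: int):
    assert flit < (1 << FLIT_BITS)
    mask = (1 << PHIT_BITS) - 1
    # Most significant phit first; 80 / 20 = 4 phits per flit.
    return [(flit >> shift) & mask
            for shift in range(FLIT_BITS - PHIT_BITS, -1, -PHIT_BITS)]

def phits_to_flit(phits):
    flit = 0
    for phit in phits:
        flit = (flit << PHIT_BITS) | phit
    return flit

flit = 0x123456789ABCDEF01234            # an arbitrary 80-bit value
phits = flit_to_phits(flit)
assert len(phits) == 4 and phits_to_flit(phits) == flit
```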
