Slide02 Parallel Computers

Parallel Computer Architecture
The End of the Road
Advantages of Multiprocessors
Able to create powerful computers by simply connecting multiple processors
More cost-effective than building a high-performance single processor
Obtain fault tolerance to carry on the tasks, albeit with degraded performance
4 Decades of Computing
Batch Era (1960s)
IBM System/360 mainframe dominated the corporate computer centers (10 MB disk, 1 MB magnetic core memory)
Typical batch processing machine
No connection beyond the computer room
Time-Sharing Era (1970s)

Advances in solid-state memory & ICs spawned the minicomputer era
Small, fast, and inexpensive enough to be spread throughout the company at the divisional level
Still too expensive and difficult to use to hand over to end-users
Time-sharing computing
Two kinds existed:
centralized data processing mainframes
time-sharing minicomputers
Desktop Era (1980s)

PCs were introduced in 1977
Many players (Altair, Tandy, Commodore, Apple, IBM, etc.)
Became pervasive and changed the face of computing
Along came networked computers (LAN & WAN)
Network Era (1990s)

Advances in network technologies led to the network computing paradigm
Transition from a processor-centric view of computing to a network-centric view
A number of commercial parallel computers with multiple processors:
Shared memory systems
Distributed memory systems
Four Decades of Computing

Feature        Batch                        Time-Sharing          Desktop                  Network
Decade         1960s                        1970s                 1980s                    1990s
Location       Computer room                Terminal room         Desktop                  Mobile
Users          Experts                      Specialists           Individuals              Groups
Data           Alphanumeric                 Text, numbers         Fonts, graphs            Multimedia
Objective      Calculate                    Access                Present                  Communicate
Interface      Punched card                 Kbd & CRT             See & point              Ask & tell
Operation      Process                      Edit                  Layout                   Orchestrate
Connectivity   None                         Peripheral cable      LAN                      Internet
Owners         Corporate computer centers   Divisional IS shops   Departmental end-users   Everyone
Current Trends
The substitution of expensive and specialized parallel machines by the more cost-effective clusters of workstations
A cluster is a collection of stand-alone computers connected using some interconnection network
The pervasiveness of the Internet created interest in network computing and, more recently, in grid computing
Grids are geographically distributed platforms of computation that provide dependable, consistent, pervasive, and inexpensive access to HPC facilities
Flynn's Taxonomy of Computer Architecture

Based on the notion of a stream of information:
instruction stream
data stream
[Figure: the CPU fetches instructions from memory and executes them (manipulates data as programmed)]

                 Single Instruction   Multiple Instruction
Single Data      SISD                 MISD
Multiple Data    SIMD                 MIMD
SIMD Architecture
Single Instruction, Multiple Data (SIMD)

(time runs downward)

P1                   P2                   Pn
prev instruction     prev instruction     prev instruction
load A(1)            load A(2)            load A(n)
load B(1)            load B(2)            load B(n)
C(1)=A(1)*B(1)       C(2)=A(2)*B(2)       C(n)=A(n)*B(n)
store C(1)           store C(2)           store C(n)
next instruction     next instruction     next instruction
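The SIMD timeline above can be sketched in a few lines of Python. This is only an illustration of the idea, not a real SIMD machine: the function name `simd_multiply` and the sample arrays are assumptions made for the example, and a list comprehension stands in for n processing elements that would each handle one index in the same time step.

```python
def simd_multiply(a, b):
    """Apply the same multiply instruction to every (a[i], b[i]) pair."""
    if len(a) != len(b):
        raise ValueError("operand arrays must have the same length")
    # Conceptually, iteration i is processing element Pi executing
    # load A(i), load B(i), C(i) = A(i) * B(i), store C(i).
    return [x * y for x, y in zip(a, b)]


A = [1, 2, 3, 4]
B = [10, 20, 30, 40]
C = simd_multiply(A, B)
print(C)  # [10, 40, 90, 160]
```

Every element goes through the identical instruction sequence; only the data differs, which is the defining property of SIMD.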
MIMD Architecture
[Figure: Control Unit-1 issues an instruction stream to P1, which exchanges a data stream with M1; likewise for Control Unit-n, Pn, and Mn]
Multiple Instruction, Multiple Data (MIMD)

(time runs downward)

P1                   P2                   Pn
prev instruction     prev instruction     prev instruction
load A(1)            call funcD           do 10 i=1,N
load B(1)            x=y^z                alpha=w**3
C(1)=A(1)*B(1)       sum=x^2              zeta=C(i)
store C(1)           call sub1(i,j)       10 continue
next instruction     next instruction     next instruction
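A rough Python analogue of the MIMD timeline above runs three different routines on three different data sets at once. The function names (`multiply`, `power_sum`, `loop_task`) and the inputs are hypothetical, chosen to mirror the three columns. Note that CPython threads interleave rather than run bytecode truly in parallel, so this only models the structure, not the speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply(a, b):
    # P1: C(i) = A(i) * B(i)
    return [x * y for x, y in zip(a, b)]


def power_sum(y, z):
    # P2: x = y^z, then sum = x^2
    x = y ** z
    return x ** 2


def loop_task(w, c):
    # Pn: a loop that sets alpha = w**3 and reads zeta = C(i)
    alpha, zeta = None, None
    for value in c:
        alpha = w ** 3
        zeta = value
    return alpha, zeta


with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [
        pool.submit(multiply, [1, 2], [3, 4]),
        pool.submit(power_sum, 2, 3),
        pool.submit(loop_task, 2, [5, 6, 7]),
    ]
    results = [f.result() for f in futures]

print(results)  # [[3, 8], 64, (8, 7)]
```

Each worker has its own instruction stream and its own data, in contrast to the single shared instruction stream of SIMD.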
SIMD Architecture Model
Consists of two parts:
a front-end computer
a processor array
Each element in the processor array is identical to the others and performs operations on different data in sync
The front end can access the PEs' memory via the bus
SIMD Architecture Model

Lock-step synchronization
Processors either do nothing or perform exactly the same operations simultaneously
In SIMD, parallelism is exploited by applying simultaneous operations across large sets of data
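The "do nothing or do exactly the same operation" rule can be modelled with an activity mask. The sketch below is an assumption-laden toy: `simd_step`, the mask representation, and the sample data are all invented for illustration and do not describe any particular machine.

```python
def simd_step(op, data, mask):
    """One lock-step cycle: active PEs apply `op`, masked-off PEs idle."""
    if len(data) != len(mask):
        raise ValueError("data and mask must have the same length")
    return [op(x) if active else x for x, active in zip(data, mask)]


values = [3, -1, 4, -5]
# Only the PEs holding a negative value take part in this cycle.
mask = [v < 0 for v in values]
result = simd_step(lambda v: -v, values, mask)
print(result)  # [3, 1, 4, 5]
```

Here a data-dependent branch (negate only negative values) is expressed as a single broadcast instruction plus a mask, so no PE ever executes a different operation from its neighbours.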
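SIMD Configurations
Configuration 1: each PE has its own local memory
[Figure: a control unit drives P1, P2, P3, ..., Pn-1, Pn; each Pi has its own memory Mi, and the PEs are linked by an interconnection network]
Configuration 2: PEs and memory modules communicate via the IN
[Figure: a control unit drives P1, P2, P3, ..., Pn-1, Pn; an interconnection network connects the PEs to memory modules M1, M2, M3, ..., Mn-1, Mn]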
ILLIAC IV
[Figure: control unit, processors P1, P2, P3, ..., Pn-1, Pn, and memories M1, M2, M3, ..., Mn-1, Mn]
MIMD Architecture
INTRODUCTION TO ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
[Figure 1.6: Shared memory versus message passing architecture. Shared memory MIMD: processors (P) reach memory modules (M) through an interconnection network. Message passing MIMD: processor/memory nodes (P/M) are connected by an interconnection network.]
Commercial examples of SMPs are Sequent Computer's Balance and Symmetry, Sun Microsystems multiprocessor servers, and Silicon Graphics Inc. multiprocessor servers. A message passing system (also referred to as distributed memory) typically combines the local memory and processor at each node of the interconnection network. There is no global memory, so it is necessary to move data from one local memory to another by means of message passing. This is typically done by a Send/Receive pair of commands, which must be written into the application software by a programmer. Thus, programmers must learn the message-passing paradigm, which involves data copying and dealing with consistency issues. Commercial examples of message passing architectures c. 1990 were the nCUBE, iPSC/2, and various Transputer-based systems. These systems eventually gave way to Internet connected systems whereby the processor/memory nodes were either Internet servers or clients on individuals' ...
information exchange through central shared memory
information exchange through network in message passing systems
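A minimal sketch of a Send/Receive pair, assuming two nodes modelled as Python threads and the network modelled as a `queue.Queue`. The names `node_a`, `node_b`, `channel`, and the dictionaries standing in for local memories are illustrative assumptions, not part of any real message passing library.

```python
import queue
import threading


def node_a(channel, local_memory):
    # Send: copy a value out of node A's local memory onto the network.
    channel.put(local_memory["x"])


def node_b(channel, local_memory):
    # Receive: block until the message arrives, then store it locally.
    local_memory["x_copy"] = channel.get()


channel = queue.Queue()   # stands in for the interconnection network
mem_a = {"x": 42}         # node A's private memory
mem_b = {}                # node B's private memory

receiver = threading.Thread(target=node_b, args=(channel, mem_b))
sender = threading.Thread(target=node_a, args=(channel, mem_a))
receiver.start()
sender.start()
sender.join()
receiver.join()

print(mem_b)  # {'x_copy': 42}
```

Neither node touches the other's memory directly: the data is copied through the channel, which is the data copying cost the message passing paradigm imposes on the programmer.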
MIMD Architecture

Shared memory MIMD:
uses a bus/cache architecture
called SMP (symmetric multiprocessor) since each processor has
an equal chance to read/write memory
equal access speed
MIMD Architecture
Message passing MIMD:
also known as distributed memory
no global memory
uses message passing to move data from one local memory to another (Send/Receive pair of commands)
this architecture gave way to Internet-connected systems
MIMD Architecture
shared memory: programming is easier
message passing: provides scalability
DSM (distributed shared memory) is a hybrid of the two
DSM
memory is physically distributed [message passing]
memory can be addressed as one (logically shared) address space [shared memory]
programming-wise, the architecture looks and behaves like a shared memory machine, but a message passing architecture lives underneath the software
example: SGI Origin2000
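One way to picture a logically shared address space on top of physically distributed memory is an address translation step. The toy below assumes a simple block-wise layout (node = address // words per node); the class `ToyDSM`, the function `locate`, and that layout are illustrative assumptions and say nothing about how the SGI Origin2000 or any real DSM system maps addresses.

```python
def locate(global_address, words_per_node):
    """Map a logically shared address to (node index, local offset)."""
    if global_address < 0:
        raise ValueError("address must be non-negative")
    return divmod(global_address, words_per_node)


class ToyDSM:
    """Single address space backed by separate per-node memories."""

    def __init__(self, num_nodes, words_per_node):
        self.words_per_node = words_per_node
        self.nodes = [[0] * words_per_node for _ in range(num_nodes)]

    def write(self, address, value):
        node, offset = locate(address, self.words_per_node)
        # In a real system this step could become a message to `node`.
        self.nodes[node][offset] = value

    def read(self, address):
        node, offset = locate(address, self.words_per_node)
        return self.nodes[node][offset]


dsm = ToyDSM(num_nodes=4, words_per_node=4)
dsm.write(10, 99)          # the program sees one flat address space
print(locate(10, 4))       # (2, 2)
print(dsm.read(10))        # 99
```

The caller only uses flat addresses, while the data actually lives in node 2's local memory, which mirrors the "shared memory on top, message passing underneath" description.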
SIMD
access control - determines which process accesses are possible to which resources
synchronization - constraints limit the time of accesses from sharing processes to shared resources
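The synchronization constraint can be illustrated with a lock guarding a shared counter. This is a sketch using Python's `threading` module; the `deposit` function, the thread count, and the iteration count are arbitrary choices for the example.

```python
import threading

counter = 0                 # the shared resource
lock = threading.Lock()     # limits when each thread may access it


def deposit(times):
    global counter
    for _ in range(times):
        # Only one thread at a time may perform the read-modify-write.
        with lock:
            counter += 1


threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

Without the lock, two threads could read the same old value and one update would be lost; the lock serializes access so the final total is deterministic.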
SIMD
protection - a system feature that prevents processes from making arbitrary access to resources belonging to other processes
MIMD
[Figure: message passing MIMD, processor nodes (P) connected by an interconnection network]
nodes are typically able to simultaneously
store messages in buffers
perform send/receive operations
scalable - the number of processors can be increased without a significant decrease in efficiency of operation
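To make "efficiency of operation" concrete, one common convention (assumed here, since the slide does not define it) is speedup divided by the number of processors. The timings below are hypothetical numbers invented purely to show the calculation.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-processor time to parallel time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Speedup per processor; 1.0 would mean perfect scaling."""
    return speedup(t_serial, t_parallel) / processors


# Hypothetical measurements: 100 s on one processor.
eff_4 = efficiency(100.0, 26.0, 4)    # 4 processors take 26 s
eff_16 = efficiency(100.0, 7.0, 16)   # 16 processors take 7 s
print(round(eff_4, 2), round(eff_16, 2))
```

Under this convention a system scales well if the efficiency stays close to its earlier value as processors are added, as it roughly does in the made-up numbers above.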
Interconnection Networks
Interconnection Networks (INs)
Can be classified based on:
mode of operation
control strategy
switching techniques
topology
Mode of Operation
Accordingly, INs are classified as:
Synchronous
a single global clock is used by all components, which operate in a lock-step manner
Asynchronous
does not require a global clock
handshaking signals are used
Synchronous systems tend to be slower than asynchronous ones; however, they are race- and hazard-free.
Control Strategy
Accordingly, INs are classified as:
Centralized: a single central control unit (CU) is used to oversee and control the operation
Decentralized: the control function is distributed among different components
Control Strategy
The function and reliability of the central control unit can become the bottleneck in a centralized control system
While the crossbar is a centralized system, the multistage interconnection networks are decentralized
Switching Techniques
INs can be classified as:
circuit switching
a complete path has to be established and remain in existence during the whole communication
packet switching
communication takes place via messages that are divided into smaller entities (packets)
packets travel in a store-and-forward manner
While packet switching tends to use resources more efficiently, it suffers from variable packet delays
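The packet switching idea can be sketched as splitting a message into numbered packets and reassembling them at the destination. The helper names `packetize` and `reassemble`, the sequence-number scheme, and the packet size are assumptions for illustration; real networks add headers, routing, and error handling.

```python
def packetize(message, size):
    """Split a message into (sequence number, payload) packets."""
    if size <= 0:
        raise ValueError("packet size must be positive")
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets):
    """Rebuild the message, whatever order the packets arrived in."""
    return b"".join(payload for _, payload in sorted(packets))


packets = packetize(b"HELLO WORLD", 4)
print(packets)  # [(0, b'HELL'), (1, b'O WO'), (2, b'RLD')]

# Variable delays can reorder packets in transit; model that by reversing.
arrived = list(reversed(packets))
print(reassemble(arrived))  # b'HELLO WORLD'
```

Because each packet travels independently, the receiver has to tolerate out-of-order arrival, which is one visible consequence of the variable packet delays mentioned above.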
Topology
Topology describes how to connect processors and memories to other processors and memories
Shared Memory INs

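bus-based
switch-based
[Figure: in the bus-based design, processors (P) with caches (C) share a bus to global memory; in the switch-based design, processors with caches reach memory modules (M) through switches]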
Message Passing INs

Static interconnection network
Dynamic interconnection network
Static INs
Linear Array
Ring
Mesh
Tree
Hypercube
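The static topologies listed above differ only in which nodes are directly linked. The helpers below compute a node's neighbours under some stated assumptions: nodes are numbered from 0, the ring has at least three nodes, the mesh is two-dimensional without wraparound, and the tree is a complete binary tree numbered level by level. All function names are invented for this sketch.

```python
def linear_array_neighbors(i, n):
    return [j for j in (i - 1, i + 1) if 0 <= j < n]


def ring_neighbors(i, n):
    # Same as the linear array, but the two ends are joined.
    return sorted({(i - 1) % n, (i + 1) % n})


def mesh_neighbors(row, col, rows, cols):
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def tree_neighbors(i, n):
    # Parent of i is (i - 1) // 2; children are 2i + 1 and 2i + 2.
    parent = [(i - 1) // 2] if i > 0 else []
    children = [c for c in (2 * i + 1, 2 * i + 2) if c < n]
    return parent + children


def hypercube_neighbors(i, dimensions):
    # Neighbours differ from i in exactly one bit of the node label.
    return [i ^ (1 << k) for k in range(dimensions)]


print(linear_array_neighbors(0, 8))  # [1]
print(ring_neighbors(0, 8))          # [1, 7]
print(mesh_neighbors(0, 0, 3, 3))    # [(1, 0), (0, 1)]
print(tree_neighbors(1, 7))          # [0, 3, 4]
print(hypercube_neighbors(5, 3))     # [4, 7, 1]
```

The hypercube case shows why that topology is popular in textbooks: a d-dimensional cube gives every one of its 2^d nodes exactly d links.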
Dynamic INs
Establish a connection between two or more nodes on the fly as messages are routed along the links
The number of hops in a path from source to destination node is equal to the number of point-to-point links a message must traverse to reach its destination
Single-stage
Multiple-stage
Crossbar switch
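The hop count defined above can be computed for any network described as an adjacency list by a breadth-first search, assuming the shortest path is the one of interest. The function `hop_count` and the six-node ring used to exercise it are illustrative assumptions.

```python
from collections import deque


def hop_count(adjacency, source, destination):
    """Fewest point-to-point links between two nodes, or None if unreachable."""
    distance = {source: 0}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        if node == destination:
            return distance[node]
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                frontier.append(neighbor)
    return None


# A six-node ring: node i is linked to i-1 and i+1 (mod 6).
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(hop_count(ring, 0, 3))  # 3
print(hop_count(ring, 0, 5))  # 1
```

Swapping in a different adjacency list (a mesh, a hypercube, or the stages of a multistage network) lets the same function compare path lengths across topologies.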
