
Parallel Computer Architecture

The End of the Road

Advantages of Multiprocessors
Able to create powerful computers by simply connecting multiple processors

More cost-effective than building a high-performance single processor

Obtain fault tolerance to carry on the tasks, albeit with degraded performance

4 Decades of Computing

Batch Era (1960s)

IBM System/360 mainframe dominated the corporate computer centers (10 MB disk, 1 MB magnetic core memory)
Typical batch processing machine
No connection beyond the computer room

Time-Sharing Era (1970s)



Advances in solid-state memory and ICs spawned the minicomputer era
Small, fast, and inexpensive enough to be spread throughout the company at the divisional level
Still too expensive and difficult to use to hand over to end-users

Two kinds of computing existed side by side:

centralized data processing mainframes
time-sharing minicomputers

Desktop Era (1980s)



PCs were introduced in 1977
Many players (Altair, Tandy, Commodore, Apple, IBM, etc.)
Became pervasive and changed the face of computing
Along came networked computers (LAN and WAN)

Network Era (1990s)



Advances in network technology led to the network computing paradigm
Transition from a processor-centric view of computing to a network-centric view
A number of commercial parallel computers with multiple processors:

Shared memory systems
Distributed memory systems

Four Decades of Computing


Feature       | Batch                      | Time-Sharing        | Desktop                | Network
Decade        | 1960s                      | 1970s               | 1980s                  | 1990s
Location      | Computer room              | Terminal room       | Desktop                | Mobile
Users         | Experts                    | Specialists         | Individuals            | Groups
Data          | Alphanumeric               | Text, numbers       | Fonts, graphs          | Multimedia
Objective     | Calculate                  | Access              | Present                | Communicate
Interface     | Punched card               | Keyboard and CRT    | See and point          | Ask and tell
Operation     | Process                    | Edit                | Layout                 | Orchestrate
Connectivity  | None                       | Peripheral cable    | LAN                    | Internet
Owners        | Corporate computer centers | Divisional IS shops | Departmental end-users | Everyone

Current Trends

The substitution of expensive and specialized parallel machines by the more cost-effective clusters of workstations

A cluster is a collection of stand-alone computers connected using some interconnection network

The pervasiveness of the Internet created interest in network computing and, more recently, in grid computing

Grids are geographically distributed platforms of computation that provide dependable, consistent, pervasive, and less expensive access to HPC facilities

Flynn's Taxonomy of Computer Architecture


Based on the notion of a stream of information, of which there are two kinds:

instruction
data

[Figure: the CPU fetches from memory and executes (manipulates data as programmed)]

               | Single Instruction | Multiple Instruction
Single Data    | SISD               | MISD
Multiple Data  | SIMD               | MIMD
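The four classes follow mechanically from how many instruction streams and data streams a machine has. The sketch below makes that mapping explicit; the function name `flynn_class` and its integer arguments are illustrative assumptions, not part of the taxonomy itself.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn class name for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    # 'S' for a single stream, 'M' for more than one.
    instr = "S" if instruction_streams == 1 else "M"
    data = "S" if data_streams == 1 else "M"
    return f"{instr}I{data}D"


if __name__ == "__main__":
    for i, d in [(1, 1), (1, 8), (8, 1), (8, 8)]:
        print(f"{i} instruction stream(s), {d} data stream(s): {flynn_class(i, d)}")
```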

SIMD Architecture

Single Instruction, Multiple Data (SIMD)


time | P1               | P2               | Pn
  |  | prev instruction | prev instruction | prev instruction
  |  | load A(1)        | load A(2)        | load A(n)
  |  | load B(1)        | load B(2)        | load B(n)
  |  | C(1)=A(1)*B(1)   | C(2)=A(2)*B(2)   | C(n)=A(n)*B(n)
  |  | store C(1)       | store C(2)       | store C(n)
  v  | next instruction | next instruction | next instruction
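The SIMD timeline can be imitated in a few lines of Python. This is only a sketch under stated assumptions: the class `ProcessingElement`, the tiny three-opcode instruction set, and the register names are hypothetical, and the inner loop over PEs stands in for what real hardware does simultaneously.

```python
class ProcessingElement:
    """One PE with its own local memory and registers (hypothetical model)."""

    def __init__(self, a, b):
        self.memory = {"A": a, "B": b, "C": None}
        self.registers = {}

    def execute(self, op, *args):
        if op == "load":        # load register <- memory
            reg, name = args
            self.registers[reg] = self.memory[name]
        elif op == "mul":       # dst <- x * y
            dst, x, y = args
            self.registers[dst] = self.registers[x] * self.registers[y]
        elif op == "store":     # memory <- register
            name, reg = args
            self.memory[name] = self.registers[reg]
        else:
            raise ValueError(f"unknown instruction: {op}")


# The single instruction stream: load A, load B, multiply, store C.
PROGRAM = [
    ("load", "r1", "A"),
    ("load", "r2", "B"),
    ("mul", "r3", "r1", "r2"),
    ("store", "C", "r3"),
]


def simd_multiply(a_values, b_values):
    """Broadcast each instruction to every PE before moving to the next one."""
    pes = [ProcessingElement(a, b) for a, b in zip(a_values, b_values)]
    for instruction in PROGRAM:      # one instruction at a time ...
        for pe in pes:               # ... applied to every PE's own data
            pe.execute(*instruction)
    return [pe.memory["C"] for pe in pes]


if __name__ == "__main__":
    print(simd_multiply([1, 2, 3], [4, 5, 6]))
```

The outer loop over instructions is the point of the sketch: no PE moves on to the next instruction until all PEs have finished the current one.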

MIMD Architecture
[Figure: control units 1 ... n each issue an instruction stream to processors P1 ... Pn; each processor exchanges a data stream with its memory M1 ... Mn]

Multiple Instruction, Multiple Data (MIMD)

time | P1               | P2               | Pn
  |  | prev instruction | prev instruction | prev instruction
  |  | load A(1)        | call funcD       | do 10 i=1,N
  |  | load B(1)        | x=y^z            | alpha=w**3
  |  | C(1)=A(1)*B(1)   | sum=x^2          | zeta=C(i)
  |  | store C(1)       | call sub1(i,j)   | 10 continue
  v  | next instruction | next instruction | next instruction
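In contrast to SIMD, each MIMD processor follows its own instruction stream on its own data. The sketch below imitates this with one Python thread per task; `run_mimd` is a hypothetical helper, and in standard CPython the threads interleave rather than run truly in parallel, so they only model independent instruction streams.

```python
import threading


def run_mimd(tasks):
    """Run each (function, args) pair in its own thread and collect the results."""
    results = [None] * len(tasks)

    def worker(index, func, args):
        # Each thread executes a different function on different data.
        results[index] = func(*args)

    threads = [
        threading.Thread(target=worker, args=(i, func, args))
        for i, (func, args) in enumerate(tasks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    tasks = [
        (lambda a, b: a * b, (3, 4)),   # P1: a multiplication
        (sum, ([1, 2, 3],)),            # P2: a reduction
        (max, (5, 9)),                  # Pn: a comparison
    ]
    print(run_mimd(tasks))
```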

SIMD Architecture Model

Consists of two parts:

a front-end computer
a processor array

the elements of the processor array are identical and perform the same operation on different data in sync
the front end can access each PE's memory via the bus

SIMD Architecture Model



lock-step synchronization
Processors either do nothing or perform exactly the same operations simultaneously
In SIMD, parallelism is exploited by applying simultaneous operations across large sets of data

SIMD Configurations

Each PE has its own local memory
[Figure: control unit; processors P1 ... Pn, each with its own memory M1 ... Mn; interconnection network]

PEs and memory modules communicate via the IN
[Figure: control unit; processors P1 ... Pn; interconnection network; memory modules M1 ... Mn]

ILLIAC IV

[Figure: control unit; processors P1 ... Pn, each with its own memory M1 ... Mn; interconnection network]

MIMD Architecture
INTRODUCTION TO ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING

[Figure 1.6: Shared memory versus message passing architecture. Shared memory MIMD: processors (P) reach memory modules (M) through an interconnection network. Message passing MIMD: processor/memory pairs are joined by an interconnection network.]

Commercial examples of SMPs are Sequent Computer's Balance and Symmetry, Sun Microsystems multiprocessor servers, and Silicon Graphics Inc. multiprocessor servers. A message passing system (also referred to as distributed memory) typically combines the local memory and processor at each node of the interconnection network. There is no global memory, so it is necessary to move data from one local memory to another by means of message passing. This is typically done by a Send/Receive pair of commands, which must be written into the application software by a programmer. Thus, programmers must learn the message-passing paradigm, which involves data copying and dealing with consistency issues. Commercial examples of message passing architectures c. 1990 were the nCUBE, iPSC/2, and various Transputer-based systems. These systems eventually gave way to Internet connected systems whereby the processor/memory nodes were either Internet servers or clients on individuals ...

information exchange through central shared memory

information exchange through network in message passing systems

MIMD Architecture

Shared Memory MIMD Architecture

typically uses a bus/cache architecture
called SMP (symmetric multiprocessor) since every processor has
an equal chance to read/write memory
equal access speed

MIMD Architecture
Message Passing MIMD Architecture

also known as distributed memory
no global memory
uses message passing to move data from one local memory to another (Send/Receive pair of commands)
this architecture gave way to Internet-connected systems
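A Send/Receive pair can be sketched with two nodes that each own a private memory and exchange data only through a message queue. The `Node` class, its method names, and the use of `queue.Queue` as the channel are assumptions made for illustration; a real message passing machine would use its own communication library.

```python
import copy
import queue
import threading


class Node:
    """A processor with local memory and an inbox (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.local_memory = {}
        self.inbox = queue.Queue()

    def send(self, destination, key):
        # Data is copied into the message; the two memories stay separate.
        destination.inbox.put((key, copy.deepcopy(self.local_memory[key])))

    def receive(self):
        # Blocks until a matching send has delivered a message.
        key, value = self.inbox.get()
        self.local_memory[key] = value
        return key


def demo():
    a, b = Node("A"), Node("B")
    a.local_memory["x"] = 42
    receiver = threading.Thread(target=b.receive)
    receiver.start()          # B waits for a message
    a.send(b, "x")            # A moves its data to B's local memory
    receiver.join()
    return b.local_memory


if __name__ == "__main__":
    print(demo())
```

Note that node B never reads A's memory directly: the only way the value arrives is through the explicit send and the matching receive.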

MIMD Architecture

shared memory: programming is easier

message passing: provides scalability

DSM (distributed-shared memory) is the hybrid between the two

DSM
memory is physically distributed [message passing]

memory can be addressed as one (logically shared) address space [shared memory]

programming-wise, the architecture looks and behaves like a shared memory machine, but a message passing architecture lives underneath the software
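The DSM idea can be sketched as a single address space that is split across node-local memories behind the scenes. Everything here is an illustrative assumption: the class name, the block-wise address mapping, and the `messages` counter that stands in for the traffic a real system would generate on a remote access.

```python
class DistributedSharedMemory:
    """One logical address space laid over per-node memories (toy model)."""

    def __init__(self, num_nodes, words_per_node):
        self.words_per_node = words_per_node
        self.nodes = [[0] * words_per_node for _ in range(num_nodes)]
        self.messages = 0   # counts accesses that would need a message

    def _locate(self, address):
        # Global address -> (owning node, offset in that node's memory).
        node, offset = divmod(address, self.words_per_node)
        if not 0 <= node < len(self.nodes):
            raise IndexError(f"address {address} is outside the shared space")
        return node, offset

    def read(self, requester, address):
        node, offset = self._locate(address)
        if node != requester:
            self.messages += 1   # remote access: message passing underneath
        return self.nodes[node][offset]

    def write(self, requester, address, value):
        node, offset = self._locate(address)
        if node != requester:
            self.messages += 1
        self.nodes[node][offset] = value


if __name__ == "__main__":
    dsm = DistributedSharedMemory(num_nodes=4, words_per_node=8)
    dsm.write(requester=0, address=10, value=7)   # address 10 lives on node 1
    print(dsm.read(requester=1, address=10), dsm.messages)
```

The caller only ever sees global addresses, which is the shared memory view; the `messages` counter shows where the message passing layer would be doing the work.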

SGI Origin2000

SIMD
[Figure: the two SIMD configurations (control unit, processors P1 ... Pn, memories M1 ... Mn, interconnection network)]

access control - determines which process accesses are possible to which resources

synchronization - constraints limit the time of accesses from sharing processes to shared resources
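Synchronization can be illustrated with several threads updating one shared counter. This is a minimal sketch, assuming Python threads as the sharing processes and a `threading.Lock` as the constraint that lets only one of them touch the shared resource at a time; the function name `parallel_count` is hypothetical.

```python
import threading


def parallel_count(num_threads, increments):
    """Have several threads increment one shared counter under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Only one thread at a time may enter this read-modify-write.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


if __name__ == "__main__":
    print(parallel_count(num_threads=4, increments=1000))
```

Without the lock, two threads could read the same old value and one update could be lost; the lock serializes the accesses so every increment is counted.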

SIMD
protection - a system feature that prevents processes from making arbitrary access to resources belonging to other processes

MIMD

[Figure: shared memory MIMD architecture versus message passing MIMD architecture]

nodes are typically able to simultaneously

store messages in buffers
perform send/receive operations

scalable - the number of processors can be increased without a significant decrease in efficiency of operation

Interconnection Networks

Interconnection Networks (INs)


Can be classified based on
mode of operation
control strategy
switching techniques
topology

Mode of Operation

According to mode of operation, INs are classified as:

Synchronous
a single global clock is used by all components, which operate in a lock-step manner

Asynchronous
does not require a global clock
handshaking signals are used instead

Synchronous mode tends to be slower than asynchronous mode; however, it is race- and hazard-free.

Control Strategy

According to control strategy, INs are classified as:

Centralized
a single central control unit (CU) is used to oversee and control the operation

Decentralized
the control function is distributed among different components

Control Strategy
The function and reliability of the central control unit can become the bottleneck in a centralized control system

While the crossbar is a centralized system, the multistage interconnection networks are decentralized

Switching Techniques

INs can be classified as:

circuit switching
a complete path has to be established and remain in existence during the whole communication

packet switching
communication takes place via messages that are divided into smaller entities (packets)
packets travel in a store-and-forward manner

While packet switching tends to use resources more efficiently, it suffers from variable packet delays
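Dividing a message into packets and putting it back together can be sketched as follows. The sequence-number scheme and the function names `packetize` and `reassemble` are assumptions for illustration, and the shuffle is a stand-in for a network in which variable delays let packets arrive out of order.

```python
import random


def packetize(message: bytes, packet_size: int):
    """Split a message into (sequence number, payload) packets."""
    if packet_size < 1:
        raise ValueError("packet_size must be at least 1")
    return [
        (seq, message[start:start + packet_size])
        for seq, start in enumerate(range(0, len(message), packet_size))
    ]


def reassemble(packets):
    """Rebuild the message by ordering packets on their sequence numbers."""
    return b"".join(payload for _, payload in sorted(packets))


if __name__ == "__main__":
    message = b"packets travel in a store-and-forward manner"
    packets = packetize(message, packet_size=8)
    random.Random(0).shuffle(packets)   # assumed out-of-order arrival
    print(reassemble(packets) == message)
```

Because every packet carries its own sequence number, the receiver does not depend on arrival order, which is what lets each packet be forwarded independently.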

Topology
Topology describes how to connect processors and memories to other processors and memories

Shared Memory INs


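bus-based
switch-based

[Figure: a bus-based system (processors P with caches C attached to a global memory) and a switch-based system (processors P with caches C connected to memory modules M)]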

Message Passing INs


Static interconnection network
Dynamic interconnection network

Static INs

Linear Array

Ring

Mesh

Tree

Hypercube
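Each static topology is defined by which nodes share a fixed link. The sketch below builds neighbor lists for four of the five topologies listed; the tree is left out because its shape (binary or otherwise) is not specified here. The function names and the node numbering are assumptions.

```python
def linear_array(n):
    """Nodes 0..n-1 in a row; end nodes have a single neighbor."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def ring(n):
    """A linear array whose two ends are also connected."""
    return {i: sorted({(i - 1) % n, (i + 1) % n} - {i}) for i in range(n)}


def mesh(rows, cols):
    """2D grid; each node links to its north, south, west, and east neighbors."""
    graph = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            graph[(r, c)] = [
                (i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols
            ]
    return graph


def hypercube(dimension):
    """2**dimension nodes; neighbors differ in exactly one address bit."""
    return {
        node: [node ^ (1 << bit) for bit in range(dimension)]
        for node in range(2 ** dimension)
    }


if __name__ == "__main__":
    print("linear array:", linear_array(4))
    print("ring:", ring(5))
    print("mesh node (0, 0):", mesh(2, 3)[(0, 0)])
    print("3-cube node 0:", hypercube(3)[0])
```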

Dynamic INs
Establish a connection between two or more nodes on the fly as messages are routed along the links

The number of hops in a path from source to destination node is equal to the number of point-to-point links a message must traverse to reach its destination
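The hop count can be made concrete by treating the network as a graph and counting links along a path. This is a sketch under two assumptions: the network is given as a dictionary of neighbor lists, and messages follow a shortest path, which breadth-first search finds. The function name `hops` and the two example networks are illustrative.

```python
from collections import deque


def hops(graph, source, destination):
    """Number of point-to-point links on a shortest path, or None if unreachable."""
    distance = {source: 0}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        if node == destination:
            return distance[node]
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                frontier.append(neighbor)
    return None


if __name__ == "__main__":
    # An 8-node ring: node i is linked to i-1 and i+1 (modulo 8).
    ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
    # A 3-dimensional hypercube: neighbors differ in one address bit.
    cube = {i: [i ^ (1 << b) for b in range(3)] for i in range(8)}
    print("ring 0 -> 3:", hops(ring, 0, 3))
    print("cube 0 -> 7:", hops(cube, 0, 7))
```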

Single-stage

Multiple-stage

Crossbar switch
