Introduction to Software Distributed Shared Memory Systems

Chang-Yi Lin 2004 / 02 / 26

Outline
What is a software DSM system?
Message-passing vs. Shared-memory
How does it work?
Memory Consistency Models
Cache Coherence
Implementation Levels
Granularity

What is a software DSM system?


A distributed-memory system (often called a multicomputer) consists of multiple independent processing nodes with local memory modules, connected by a general interconnection network.
Global shared memory.

What is a software DSM system?


A DSM system logically implements the shared-memory model on a physically distributed-memory system. The DSM system hides the remote communication mechanism from the application writer, preserving the programming ease and portability typical of shared-memory systems.
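As a rough illustration of what "hiding the remote communication" means, the sketch below models a global address space spread over the local memories of several nodes. It is a toy, single-process simulation under stated assumptions: the class `ToyDSM`, the block-wise address mapping, and the `remote_accesses` counter are all hypothetical, and a real DSM system would send network messages where this code only increments a counter.

```python
class ToyDSM:
    """Toy model: one global address space backed by per-node local memory."""

    def __init__(self, num_nodes, words_per_node):
        self.words_per_node = words_per_node
        # Each node owns one contiguous slice of the global address space.
        self.local_memory = [[0] * words_per_node for _ in range(num_nodes)]
        self.remote_accesses = 0  # stands in for network messages

    def _locate(self, address):
        # Map a global address to (home node, offset in that node's memory).
        return divmod(address, self.words_per_node)

    def read(self, node, address):
        home, offset = self._locate(address)
        if home != node:
            self.remote_accesses += 1  # the application never sees this step
        return self.local_memory[home][offset]

    def write(self, node, address, value):
        home, offset = self._locate(address)
        if home != node:
            self.remote_accesses += 1
        self.local_memory[home][offset] = value
```

The application code only calls `read` and `write` on global addresses; whether the word lives on the local node or a remote one is decided inside the DSM layer.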

Message-passing vs. Shared-memory


Two different programming models. Shared-memory programming is generally considered easier.

Message-passing vs. Shared-memory

Message-passing

Point-to-point communication

Message-passing vs. Shared-memory

Message-passing

buffers and data types
blocking and nonblocking
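The message-passing ideas above (point-to-point communication, blocking and nonblocking operations) can be sketched with Python's standard `queue.Queue` acting as a channel between two threads. This is only a stand-in, not a real message-passing library such as MPI; the names `channel`, `sender`, `try_receive`, `blocking_receive`, and `demo` are hypothetical.

```python
import queue
import threading

# A point-to-point channel between one sender and one receiver.
# Assumption: an in-process queue stands in for a network link.
channel = queue.Queue()

def sender(values):
    """Explicitly send each value to the receiver."""
    for v in values:
        channel.put(v)  # the message carries the data itself

def try_receive():
    """Nonblocking receive: return a message, or None if none is waiting."""
    try:
        return channel.get_nowait()
    except queue.Empty:
        return None

def blocking_receive():
    """Blocking receive: wait until a message arrives."""
    return channel.get()

def demo():
    first = try_receive()  # nothing has been sent yet
    t = threading.Thread(target=sender, args=([1, 2, 3],))
    t.start()
    received = [blocking_receive() for _ in range(3)]
    t.join()
    return first, received
```

Note that the programmer has to name the data to send and decide where to wait for it, which is the bookkeeping a shared-memory model avoids.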

Message-passing vs. Shared-memory

Pthread
Thread 1: lock; A++; unlock
Thread 2: lock; A++; unlock
Thread 3: lock; A++; unlock
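The three-thread example above can be written as a small runnable sketch. It uses Python's `threading` module rather than Pthreads, so it illustrates the programming style and is not the slide's original code; the names `increment` and `run` are hypothetical.

```python
import threading

A = 0                      # the shared variable from the example
lock = threading.Lock()    # protects every update of A

def increment(times):
    """Each thread does: lock; A++; unlock."""
    global A
    for _ in range(times):
        with lock:         # acquire on entry, release on exit
            A += 1

def run(num_threads=3, times=1000):
    """Start the threads, wait for them, and return the final value of A."""
    global A
    A = 0
    threads = [threading.Thread(target=increment, args=(times,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return A
```

All threads simply read and write `A`; no thread ever sends the value to another one explicitly.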

How does it work?


Shared-memory applications
DSM system
Memory
Interconnection network

How does it work?


[Figure: four nodes, each with its own App, DSM, and Mem layers, connected by an interconnection network]

Memory Consistency Models

What is memory consistency?

P1: w(x)1   R(x)1
---------------------------------- time

P1: w(x)1
P2:         R(x)?
---------------------------------------- time

Memory Consistency Models


A consistency model is essentially a contract between the software and the memory. If the software agrees to obey certain rules, the memory promises to work correctly. If the software violates these rules, correctness of memory operation is no longer guaranteed.

Memory Consistency Models


Strict consistency Sequential consistency (SC) Release consistency (RC) Scope consistency (ScC)

Memory Consistency Models

Strict consistency

Definition: any read of memory location x returns the value stored by the most recent write operation to x. Impossible to implement in a DSM system, because it relies on absolute global time.
P1: w(x)1
P2:         R(x)1
---------------------------------------- time

Memory Consistency Models

Sequential consistency (SC)

Definition: the result of any execution is the same as if the operations of all processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program. Any valid interleaving is acceptable behavior, but all processes must see the same sequence of memory references.

Memory Consistency Models

Sequential consistency (SC)


P1: (A) a=1; (B) Print(b,c);
P2: (C) b=1; (D) Print(a,c);
P3: (E) c=1; (F) Print(a,b);

(A) (B) (C) (D) (E) (F)

(A) (C) (D) (B) (E) (F)

(C) (E) (F) (D) (A) (B)

(A) (C) (E) (B) (D) (F)
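The four orderings above are only a few of the interleavings that sequential consistency allows. As a minimal sketch, the code below enumerates every interleaving that keeps each process's program order and computes what the three Print statements would output, assuming a, b, and c start at 0. The names `PROGRAMS`, `respects_program_order`, `run`, and `valid_interleavings` are hypothetical.

```python
from itertools import permutations

# Program order of each process, using the statement labels from the example.
PROGRAMS = {"P1": ["A", "B"], "P2": ["C", "D"], "P3": ["E", "F"]}

# Statement semantics: writes set a variable to 1, prints read two variables.
WRITES = {"A": "a", "C": "b", "E": "c"}
PRINTS = {"B": ("b", "c"), "D": ("a", "c"), "F": ("a", "b")}

def respects_program_order(order):
    """True if every process's statements appear in program order."""
    for stmts in PROGRAMS.values():
        positions = [order.index(s) for s in stmts]
        if positions != sorted(positions):
            return False
    return True

def run(order):
    """Execute one interleaving; return the printed digits in time order."""
    mem = {"a": 0, "b": 0, "c": 0}  # assumption: all variables start at 0
    out = []
    for s in order:
        if s in WRITES:
            mem[WRITES[s]] = 1
        else:
            x, y = PRINTS[s]
            out.append(f"{mem[x]}{mem[y]}")
    return "".join(out)

def valid_interleavings():
    """All orderings of the six statements that keep program order."""
    return [p for p in permutations("ABCDEF") if respects_program_order(p)]
```

With these definitions, `run("ABCDEF")`, `run("ACDBEF")`, `run("CEFDAB")`, and `run("ACEBDF")` give `001011`, `101011`, `010111`, and `111111` for the four orderings listed above. There are 90 valid interleavings in total, and none of them prints `000000`, because each process writes its own variable before it prints.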

Memory Consistency Models

Release Consistency (RC)

Two types of access


Ordinary access: read and write
Synchronization access: acquire lock, release lock and barrier

Rules:
Before an ordinary access to a shared variable is performed, all previous acquires done by the process must have completed successfully.
Before a release is allowed to be performed, all previous reads and writes done by the process must have completed.

Memory Consistency Models

Release Consistency (RC)


Rule 1: Acq(L1) Rel(L1) Acq(L5) Acq(L3) R(x)
Rule 2: Acq(L2) w(x) R(y) R(z) Rel(L2)

Example:
P1: Acq(L) w(x)1 w(x)2 Rel(L)
P2:                           Acq(L) R(x)2 Rel(L)
P3:                                               R(x)?
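The example above (P1 writes x under lock L, P2 reads x after acquiring L, P3 reads x without synchronizing) can be mimicked with a toy, single-threaded model. It is only a sketch of one possible eager implementation: writes are buffered locally and pushed to a home copy at release, and a process refreshes its view at acquire. The classes `HomeMemory` and `Process` and the function `scenario` are hypothetical, and mutual exclusion on L is not modeled.

```python
class HomeMemory:
    """Holds the values made visible by completed releases."""
    def __init__(self):
        self.values = {}

class Process:
    def __init__(self, home):
        self.home = home
        self.local = {}    # this process's view, possibly stale
        self.pending = {}  # writes not yet propagated

    def acquire(self):
        # Rule 1: once the acquire completes, later accesses see
        # everything published by earlier releases.
        self.local = dict(self.home.values)

    def write(self, var, value):
        self.local[var] = value
        self.pending[var] = value

    def read(self, var):
        return self.local.get(var, 0)  # assumption: variables start at 0

    def release(self):
        # Rule 2: all earlier writes complete before the release is performed.
        self.home.values.update(self.pending)
        self.pending.clear()

def scenario():
    home = HomeMemory()
    p1, p2, p3 = Process(home), Process(home), Process(home)
    p1.acquire(); p1.write("x", 1); p1.write("x", 2); p1.release()
    p2.acquire(); p2_value = p2.read("x"); p2.release()
    p3_value = p3.read("x")  # no acquire: no guarantee about what is seen
    return p2_value, p3_value
```

In this toy model P2 always reads 2, while P3 still sees the initial value. A real system makes no promise about P3's read, which is why the example marks it with a question mark.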

Memory Consistency Models

Scope consistency (ScC)

Memory Consistency Models

Relaxing consistency permits temporary inconsistencies (delayed updates)

Lazy release consistency (LRC) (TreadMarks, CVM)
Scope consistency (ScC) (JIAJIA, JUMP)

Cache Coherence

Write invalidate

Suffers from false sharing

Write update

Too expensive when there are many replicas
Works best in applications with tight sharing
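The trade-off between the two protocols can be made concrete by counting messages in a toy model of a single replicated block. This is a sketch under simplifying assumptions: one message per invalidation, update, or refetch, no ownership transfer, and no acknowledgements. The class `Block` and the functions `many_replicas` and `tight_sharing` are hypothetical.

```python
class Block:
    """One shared block replicated on several nodes (toy model)."""

    def __init__(self, nodes, protocol):
        self.protocol = protocol             # "invalidate" or "update"
        self.value = 0
        self.copies = {n: 0 for n in nodes}  # nodes holding a valid copy
        self.messages = 0

    def write(self, node, value):
        self.value = value
        others = [n for n in self.copies if n != node]
        for n in others:
            if self.protocol == "update":
                self.copies[n] = value       # push the new value
            else:
                del self.copies[n]           # drop the stale copy
            self.messages += 1
        self.copies[node] = value

    def read(self, node):
        if node not in self.copies:          # miss after an invalidation
            self.copies[node] = self.value
            self.messages += 1
        return self.copies[node]

def many_replicas(protocol, nodes=8, writes=10):
    """One node writes repeatedly while the other replicas stay idle."""
    block = Block(range(nodes), protocol)
    for i in range(writes):
        block.write(0, i)
    return block.messages

def tight_sharing(protocol, rounds=10):
    """Node 0 writes and node 1 reads the value, over and over."""
    block = Block(range(2), protocol)
    for i in range(rounds):
        block.write(0, i)
        assert block.read(1) == i
    return block.messages
```

With the default arguments, `many_replicas` counts 70 messages for write update against 7 for write invalidate, while `tight_sharing` counts 10 for write update against 20 for write invalidate. The absolute numbers depend entirely on the assumptions above; only the direction of the comparison is meant to carry over.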

Implementation Levels

Modifying OS kernel

IVY (SC): modifying the memory management unit (MMU) of the OS to map between the shared virtual memory address space and the local memory.

Language level

Linda, Orca

User-level runtime library

TreadMarks, CVM, JIAJIA, JUMP, Brazos

Combination of multiple implementation levels, even hardware support

Munin, Midway, NCP2

Granularity

The choice of the block size depends on

the cost of communication

1-byte message vs. 1024-byte message

locality of reference in the application

Most DSM systems use page-based granularity, with page sizes from 1 KB to 8 KB. A larger page size exploits locality of reference better.
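The comparison of a 1-byte and a 1024-byte message can be illustrated with a simple cost model in which each message pays a fixed latency plus a per-byte transfer time. The latency and bandwidth values below are made-up assumptions, not measurements of any system, and the function names `transfer_time_us` and `fetch_time_us` are hypothetical.

```python
import math

def transfer_time_us(num_bytes, latency_us=100.0, bandwidth_bytes_per_us=100.0):
    """Time for one message: fixed latency plus size divided by bandwidth.

    The defaults (100 us latency, 100 bytes/us) are illustrative assumptions.
    """
    return latency_us + num_bytes / bandwidth_bytes_per_us

def fetch_time_us(total_bytes, block_size):
    """Time to fetch total_bytes of contiguous data, one block per message."""
    num_messages = math.ceil(total_bytes / block_size)
    return num_messages * transfer_time_us(block_size)
```

Under these assumed numbers a 1024-byte message costs about 110 us against about 100 us for a 1-byte message, so the fixed latency dominates and larger blocks amortize it. Fetching 8 KB in eight 1 KB blocks takes far longer than one 8 KB block, provided the application really uses all of the data; when it does not, the extra bytes are wasted and false sharing becomes more likely.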

IVY Munin TreadMarks CVM Midway NCP2 Quarks softFLASH Cashmere-2L Brazos

Yale Rice Rice Maryland CMU UFRJ, Brazil Utah Stanford Rochester Rice

(1KB) (4KB) (4KB) region (16KB) (8KB)

SC Eager RC LRC LRC-MW LRCSW SC EC PC RC EC RC RC SC RC DIRC HLRC ScC

WI WU/WI MW WI WU WU WU/WI WU/WI MW FLASH-like WU Early update WU WI WI

Shasta Mermaid

DEC WRL Toronto

SC

(1KB 8KB) SC

Dsoftware DSM6K IBM Research Mirage JIAJIA Simple-COMA Blizzard-S Shrimp Linda Orca UCLA

(4KB) 512Bytes (4KB)

SC SC ScC SC SC AURC SC SC

WI WI WI WI WI WU/WI Impl.dependent WU

SICS(Sweden) and SUN Wisconsin Princeton Yale Vrije Univ.,
