
Distributed Systems

Principles and Paradigms

Maarten van Steen

VU Amsterdam, Dept. Computer Science


Room R4.20, steen@cs.vu.nl

Chapter 11: Distributed File Systems


Version: December 4, 2011


Contents

Chapter
01: Introduction
02: Architectures
03: Processes
04: Communication
05: Naming
06: Synchronization
07: Consistency & Replication
08: Fault Tolerance
09: Security
10: Distributed Object-Based Systems
11: Distributed File Systems
12: Distributed Web-Based Systems
13: Distributed Coordination-Based Systems


11.1 Architecture

Distributed File Systems

General goal
Try to make a file system transparently available to remote clients.

[Figure: two models for accessing remote files. Remote access model: requests from the client to access the remote file are handled at the server; the file stays on the server. Upload/download model: (1) the file is moved to the client, (2) accesses are done on the client, (3) when the client is done, the file is returned to the server.]


Example: NFS Architecture


NFS
NFS is implemented using the Virtual File System (VFS) abstraction, which is now used in many different operating systems.

[Figure: NFS architecture. On both client and server, a system call layer sits on top of a virtual file system (VFS) layer. On the client, the VFS layer dispatches either to the local file system interface or to the NFS client, whose RPC client stub communicates across the network with the server's RPC server stub; the NFS server then accesses the server's local file system interface through its VFS layer.]



Example: NFS Architecture

Essence
VFS provides a standard file system interface and hides the difference between accessing a local and a remote file system.

Question
Is NFS actually a file system?



NFS File Operations


Operation  v3   v4   Description
Create     Yes  No   Create a regular file
Create     No   Yes  Create a nonregular file
Link       Yes  Yes  Create a hard link to a file
Symlink    Yes  No   Create a symbolic link to a file
Mkdir      Yes  No   Create a subdirectory
Mknod      Yes  No   Create a special file
Rename     Yes  Yes  Change the name of a file
Remove     Yes  Yes  Remove a file from a file system
Rmdir      Yes  No   Remove an empty subdirectory
Open       No   Yes  Open a file
Close      No   Yes  Close a file
Lookup     Yes  Yes  Look up a file by means of a name
Readdir    Yes  Yes  Read the entries in a directory
Readlink   Yes  Yes  Read the path name in a symbolic link
Getattr    Yes  Yes  Get the attribute values for a file
Setattr    Yes  Yes  Set one or more file-attribute values
Read       Yes  Yes  Read the data contained in a file
Write      Yes  Yes  Write data to a file

Cluster-Based File Systems


Observation
With very large data collections, a simple client-server approach is not going to speed up file accesses ⇒ apply striping techniques by which files can be fetched in parallel (a sketch follows the figure).

[Figure: whole-file distribution versus a file-striped system. With whole-file distribution, every server stores complete copies of files a–e; in a file-striped system, the blocks of each file are spread across the servers, so that the blocks of a file can be fetched in parallel.]
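A minimal sketch of reading a striped file, with in-memory dicts standing in for the servers; the layout table and fetch_block helper are illustrative, not part of any real system. The point is that all blocks of a file can be fetched concurrently and reassembled in order:

```python
import concurrent.futures

# Hypothetical servers: server index -> {block index -> block data}.
# In a real striped file system these lookups would be network fetches.
SERVERS = [
    {0: b"aaaa", 3: b"dddd"},
    {1: b"bbbb", 4: b"eeee"},
    {2: b"cccc"},
]

def fetch_block(server_id: int, block_idx: int) -> bytes:
    """Fetch one block from one server (stands in for an RPC)."""
    return SERVERS[server_id][block_idx]

def read_striped_file(layout: dict[int, int], num_blocks: int) -> bytes:
    """Fetch all blocks of a file in parallel and reassemble them in order."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {idx: pool.submit(fetch_block, layout[idx], idx)
                   for idx in range(num_blocks)}
        return b"".join(futures[idx].result() for idx in range(num_blocks))

# Block index -> server holding it (round-robin striping).
layout = {0: 0, 1: 1, 2: 2, 3: 0, 4: 1}
print(read_striped_file(layout, 5))  # b"aaaabbbbccccddddeeee"
```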


Example: Google File System


[Figure: Google File System architecture. A GFS client sends (file name, chunk index) to the master and gets back a contact address; it then sends (chunk ID, byte range) to a chunk server and receives the chunk data. The master exchanges instructions and chunk-server state with the chunk servers, each of which stores its chunks in a local Linux file system.]

The Google solution
Divide files into large 64 MB chunks, and distribute/replicate chunks across many servers:
The master maintains only a (file name, chunk server) table in main memory ⇒ minimal I/O.
Files are replicated using a primary-backup scheme; the master is kept out of the loop (see the sketch below).
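A minimal sketch of the master's role, assuming a plain dict as the in-memory table; the Master class and its lookup method are illustrative names, not the real GFS interface. The master only translates (file name, chunk index) pairs into chunk-server addresses; chunk data never flows through it:

```python
CHUNK_SIZE = 64 * 1024 * 1024  # GFS uses large 64 MB chunks

class Master:
    def __init__(self):
        # (file name, chunk index) -> chunk-server addresses; by convention
        # in this sketch, the first address is the primary replica.
        self.chunk_table = {}

    def lookup(self, file_name: str, offset: int) -> list[str]:
        """Translate (file name, byte offset) into chunk-server addresses.
        The client then talks to the chunk servers directly; the master
        stays out of the data path."""
        chunk_index = offset // CHUNK_SIZE
        return self.chunk_table[(file_name, chunk_index)]

master = Master()
master.chunk_table[("/logs/web.log", 0)] = ["cs1:7000", "cs2:7000", "cs3:7000"]
print(master.lookup("/logs/web.log", 10_000_000))  # offset falls in chunk 0
```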


P2P-based File Systems


[Figure: the Ivy P2P file system as a layered stack on every node: a file system layer (Ivy) on top of block-oriented storage (DHash) on top of a DHT layer (Chord), connected through the network; one of the nodes is where a file system is rooted.]

Basic idea
Store data blocks in the underlying P2P system (a sketch follows):
Every data block with content D is stored on a node with hash h(D); this allows for an integrity check.
Public-key blocks are signed with the associated private key and looked up with the public key.
A local log of file operations keeps track of ⟨blockID, h(D)⟩ pairs.
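A small sketch of content-addressed block storage in this style, with a plain dict standing in for the Chord/DHash DHT and SHA-1 as the content hash; put_block and get_block are hypothetical helpers, not Ivy's API. Because the key is the hash of the content, any node can check a fetched block's integrity:

```python
import hashlib

dht = {}  # stands in for the distributed hash table

def put_block(data: bytes) -> str:
    """Store a data block under the hash of its content."""
    key = hashlib.sha1(data).hexdigest()
    dht[key] = data
    return key

def get_block(key: str) -> bytes:
    """Fetch a block and verify its integrity against the key."""
    data = dht[key]
    if hashlib.sha1(data).hexdigest() != key:
        raise ValueError("block content does not match its hash")
    return data

block_id = put_block(b"hello, Ivy")
assert get_block(block_id) == b"hello, Ivy"
```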

11.5 Synchronization

File sharing semantics


Problem
When dealing with distributed file systems, we need to take into account the ordering of concurrent read/write operations and the expected semantics (i.e., consistency).

[Figure: (a) On a single machine, when process B reads file "ab" after process A has appended "c", B's read returns "abc". (b) In a distributed file system with local copies, process A on client machine #1 reads "ab" and appends "c" to its local copy, while the file server still holds the original file "ab"; a subsequent read by process B on client machine #2 still returns "ab".]


File sharing semantics

Semantics
UNIX semantics: a read operation returns the effect of the last write operation ⇒ can only be implemented for remote access models in which there is only a single copy of the file.
Transaction semantics: the file system supports transactions on a single file ⇒ the issue is how to allow concurrent access to a physically distributed file.
Session semantics: the effects of read and write operations are seen only by the client that has opened (a local copy of) the file ⇒ what happens when a file is closed (only one client may actually win)? See the sketch below.
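A toy illustration of session semantics, with an in-memory "server" and an illustrative Session class (nothing here is a real file system API): each session works on its own copy, and closing simply overwrites the server's file, so the last close wins.

```python
server = {"f": "ab"}  # server-side file contents

class Session:
    def __init__(self, name: str):
        self.name = name
        self.copy = server[name]      # open: take a private local copy

    def write(self, data: str):
        self.copy += data             # visible only within this session

    def close(self):
        server[self.name] = self.copy # whole copy replaces server's file

s1, s2 = Session("f"), Session("f")   # two concurrent sessions on "f"
s1.write("c"); s1.close()             # server now holds "abc"
s2.write("d"); s2.close()             # s2's close wins; "c" is lost
print(server["f"])                    # -> "abd"
```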



Example: File sharing in Coda

Essence
Coda assumes transactional semantics, but without the full-fledged capabilities of real transactions. Note: transactional issues reappear in the form of "this ordering could have taken place."

[Figure: transactional behavior in sharing files in Coda. One client opens file f for reading (session SA) and receives a copy from the server. A second client then opens f for writing (session SB) and, on close, ships the updated file back, causing the server to send an invalidate to the first client; the first client nevertheless finishes session SA on its own copy.]
11.6 Consistency and Replication

Consistency and replication

Observation
In modern distributed file systems, client-side caching is the preferred
technique for attaining performance; server-side replication is done for fault
tolerance.

Observation
Clients are allowed to keep (large parts of) a file, and will be notified when control is withdrawn ⇒ servers are now generally stateful.
[Figure: file delegation. (1) The client asks for a file; (2) the server delegates the (old) file to the client, which then performs its accesses on a local copy; (3) the server recalls the delegation; (4) the client returns the updated file.]
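A minimal sketch of the delegation/recall idea, not NFSv4's actual protocol; all class and method names are illustrative, and the recall trigger (a second client opening the file) is an assumption for the example. The server is stateful: it remembers which client holds each file so it can recall the delegation before handing the file to someone else.

```python
class Server:
    def __init__(self):
        self.files = {"f": b"old contents"}
        self.delegations = {}             # file name -> client holding it

    def open(self, client, name):
        holder = self.delegations.get(name)
        if holder is not None and holder is not client:
            # Recall: the holder must return its (possibly updated) copy.
            self.files[name] = holder.return_file(name)
        self.delegations[name] = client   # delegate to the new client
        return self.files[name]

class Client:
    def __init__(self, server):
        self.server, self.cache = server, {}

    def open(self, name):
        self.cache[name] = self.server.open(self, name)

    def write(self, name, data):
        self.cache[name] = data           # local update; server not contacted

    def return_file(self, name):
        return self.cache.pop(name)       # invoked on delegation recall

srv = Server()
a, b = Client(srv), Client(srv)
a.open("f"); a.write("f", b"new contents")
b.open("f")                               # server recalls a's delegation first
print(b.cache["f"])                       # b sees a's update
```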



Example: Client-side caching in Coda

[Figure: client-side caching in Coda. Client A opens session SA (Open(RD)) and the server transfers file f. Client B then runs session SB (Open(WR) ... Close), shipping a modified f back to the server, which sends client A an invalidate (callback break). A's next Open(RD) therefore requires transferring f again, whereas B's next Open(WR) is answered with "OK (no file transfer)" because B's cached copy is still valid.]

Note
By making use of transactional semantics, it becomes possible to
further improve performance.



Example: Server-side replication in Coda

[Figure: two clients with a different AVSG for the same replicated file. The file is stored at servers S1, S2, and S3; the network is broken such that client A can reach only {S1, S2} and client B only {S3}.]

Main issue
Ensure that concurrent updates are detected:
Each client has an Accessible Volume Storage Group (AVSG): the subset of the actual VSG that the client can reach.
Version vector: CVVi(f)[j] = k means that server Si knows that server Sj has seen version k of f.
Example: A updates f ⇒ S1 = S2 = [+1, +1, +0]; B updates f ⇒ S3 = [+0, +0, +1] (see the sketch below).
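The example can be replayed with a small version-vector sketch; the update and compare helpers are illustrative, not Coda's code. An update increments only the entries for the servers in the client's AVSG, so updates made in different partitions produce incomparable vectors, which is exactly how the conflict is detected on repair:

```python
def update(cvv, avsg):
    """Commit an update at the servers in the client's AVSG."""
    return [v + 1 if i in avsg else v for i, v in enumerate(cvv)]

def compare(v1, v2):
    """Compare two version vectors: one dominates, or updates conflict."""
    if all(x <= y for x, y in zip(v1, v2)):
        return "second dominates"
    if all(x >= y for x, y in zip(v1, v2)):
        return "first dominates"
    return "conflict: concurrent updates detected"

# VSG = {S1, S2, S3}; the network partitions into {S1, S2} and {S3}.
cvv0 = [0, 0, 0]
at_s1_s2 = update(cvv0, {0, 1})   # A updates f -> [1, 1, 0] at S1 and S2
at_s3    = update(cvv0, {2})      # B updates f -> [0, 0, 1] at S3
print(compare(at_s1_s2, at_s3))   # conflict: concurrent updates detected
```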

11.7 Fault Tolerance

High availability in P2P systems

Problem
There are many fully decentralized file-sharing systems, but because churn is high (i.e., nodes come and go all the time), we may face an availability problem ⇒ replicate files all over the place (replication factor: r_rep).

Alternative
Apply erasure coding:
Partition a file F into m fragments, and recode them into a collection of n > m fragments.
Property: any m fragments from the collection are sufficient to reconstruct F.
Replication factor: r_ec = n/m.


Replication vs. erasure coding

Comparison
With an average node availability a and a required file unavailability ε, we have for erasure coding

$1 - \varepsilon = \sum_{i=m}^{r_{ec}\,m} \binom{r_{ec}\,m}{i}\, a^{i} (1 - a)^{r_{ec}\,m - i}$

and for file replication

$1 - \varepsilon = 1 - (1 - a)^{r_{rep}}$

[Figure: the replication factors r_rep and r_ec required to attain the same unavailability ε, plotted against node availability a.]
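Both formulas are easy to evaluate numerically. The sketch below does so with Python's math.comb; the function names and the example parameters (a = 0.5, m = 5, r_ec = 3) are ours, chosen only for illustration:

```python
from math import comb

def avail_replication(a: float, r_rep: int) -> float:
    """1 - epsilon for replication: the file is unavailable only when
    all r_rep replicas are down."""
    return 1 - (1 - a) ** r_rep

def avail_erasure(a: float, m: int, r_ec: float) -> float:
    """1 - epsilon for erasure coding: the file is available when at
    least m of the n = r_ec * m fragments are reachable."""
    n = round(r_ec * m)
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i)
               for i in range(m, n + 1))

print(avail_replication(0.5, r_rep=3))    # 1 - 0.5**3 = 0.875
print(avail_erasure(0.5, m=5, r_ec=3.0))  # any 5 of 15 fragments suffice
```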

