DEFINITIONS:
A file system provides a service for clients. The server interface is the normal set of file
operations: create, read, etc. on files.
Introduction
Distributed file systems support the sharing of
information in the form of files throughout the
intranet.
A distributed file system enables programs to
store and access remote files exactly as they
do on local ones, allowing users to access
files from any computer on the intranet.
Recent advances in higher-bandwidth
connectivity of switched local networks and
disk organization have led to high-performance
and highly scalable file systems.
DISTRIBUTED FILE SYSTEMS: Definitions
Clients, servers, and storage are dispersed across machines.
Configuration and implementation may vary -
Clients should view a DFS the same way they would a centralized FS; the
distribution is hidden at a lower level.
In a conventional file system, it's understood where the file actually resides;
the system and disk are known.
Location transparency -
a) The name of a file does not reveal any hint of the file's physical storage
location.
b) File name still denotes a specific, although hidden, set of physical disk
blocks.
DISTRIBUTED FILE SYSTEMS: Naming and Transparency
The ANDREW DFS AS AN EXAMPLE:
Is location independent.
Supports file mobility.
Separation of FS and OS allows for disk-less systems. These have lower
cost and convenient system upgrades. The performance is not as good.
NAMING SCHEMES:
Same naming works on local and remote files. The DFS is a loose
collection of independent file systems.
3. A single global name structure spans all the files in the system.
IMPLEMENTATION TECHNIQUES:
name ----> file_identifier ----> < system, disk, cylinder, sector >
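The two-level mapping above can be sketched with two lookup tables. This is only an illustration: the table names, the `resolve` and `migrate` helpers, and the sample entries are hypothetical and do not come from any particular DFS.

```python
from typing import NamedTuple


class Location(NamedTuple):
    """Low-level storage address, as in < system, disk, cylinder, sector >."""
    system: str
    disk: int
    cylinder: int
    sector: int


# Level 1: textual name -> location-independent file identifier (made-up values).
NAME_TO_ID = {"/home/alice/notes.txt": 1042}

# Level 2: file identifier -> current physical location (made-up values).
ID_TO_LOCATION = {1042: Location("serverA", 0, 17, 3)}


def resolve(name: str) -> Location:
    """Resolve a name in two steps: name -> identifier -> location."""
    file_id = NAME_TO_ID[name]
    return ID_TO_LOCATION[file_id]


def migrate(file_id: int, new_location: Location) -> None:
    """Move a file: only the second-level table changes; the name is untouched."""
    ID_TO_LOCATION[file_id] = new_location
```

Because the name maps to an identifier rather than to a disk address, a file can move between machines without its name changing, which is what location independence requires.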
DISTRIBUTED FILE SYSTEMS: Remote File Access
CACHING
Caching is a mechanism for maintaining disk data on the local machine. This data can be
kept in the local memory or in the local disk. Caching can be advantageous both for read
ahead and read again.
The cost of getting data from a cache is a few HUNDRED instructions; disk accesses cost
THOUSANDS of instructions.
The master copy of a file doesn't move, but caches contain replicas of portions of the file.
Caching behaves just like "networked virtual memory".
What should be cached? The unit can range from individual blocks to whole files.
Larger units give a better hit rate; smaller units give better transfer times.
Caching on disk gives:
Better reliability.
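As a rough illustration of block-level caching for "read again", the sketch below keeps recently used blocks in local memory and goes to the server only on a miss. The class name `BlockCache`, the least-recently-used eviction policy, and the `fetch_remote` callable are assumptions made for this example; the slides do not prescribe a replacement policy.

```python
from collections import OrderedDict


class BlockCache:
    """Tiny LRU cache of (file_id, block_no) -> bytes; capacity counted in blocks."""

    def __init__(self, capacity, fetch_remote):
        self.capacity = capacity
        self.fetch_remote = fetch_remote  # callable standing in for a server read
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read_block(self, file_id, block_no):
        key = (file_id, block_no)
        if key in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(key)  # mark as most recently used
            return self.blocks[key]
        self.misses += 1
        data = self.fetch_remote(file_id, block_no)
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data
```

The master copy stays on the server; the cache only holds replicas of some blocks, so evicting a clean block loses nothing.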
FILE REPLICATION:
General File Service Architecture
The responsibilities of a DFS are typically
distributed among three modules:
Client module which emulates the conventional
file system interface
Two server modules, which perform operations for
clients on directories and on files.
File Service Architecture
Flat File Service:
Concerned with implementing operations on
the contents of files.
Client Module:
Runs on each client computer
Integrates and extends the operations of the flat
file service under a single application
programming interface.
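The division of labour between the modules can be sketched as follows. This is an in-memory toy, not a real service: the class names, method names, and integer file ids are hypothetical, and the two server modules are modelled as one class for file contents and one for directories.

```python
import itertools


class FlatFileService:
    """Operations on file contents, addressed by an opaque file id."""

    def __init__(self):
        self._files = {}
        self._ids = itertools.count(1)

    def create(self):
        file_id = next(self._ids)
        self._files[file_id] = bytearray()
        return file_id

    def read(self, file_id, offset, n):
        return bytes(self._files[file_id][offset:offset + n])

    def write(self, file_id, offset, data):
        buf = self._files[file_id]
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # zero-fill any gap
        buf[offset:offset + len(data)] = data


class DirectoryService:
    """Maps text names to file ids; knows nothing about file contents."""

    def __init__(self):
        self._names = {}

    def add_name(self, name, file_id):
        self._names[name] = file_id

    def lookup(self, name):
        return self._names[name]


class ClientModule:
    """Combines both services behind one name-based interface."""

    def __init__(self, files, directory):
        self.files, self.directory = files, directory

    def create(self, name):
        self.directory.add_name(name, self.files.create())

    def write(self, name, offset, data):
        self.files.write(self.directory.lookup(name), offset, data)

    def read(self, name, offset, n):
        return self.files.read(self.directory.lookup(name), offset, n)
```

Application code talks only to the client module, so it never needs to know that names and contents are handled by separate servers.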
What is NFS?
First commercially successful network
file system:
Developed by Sun Microsystems for their
diskless workstations
Designed for robustness and adequate
performance
Sun published all protocol specifications
Many implementations exist
DISTRIBUTED FILE SYSTEMS: SUN Network File System
OVERVIEW:
Highlights
NFS is stateless
All client requests must be self-contained
The virtual filesystem interface
VFS operations
VNODE operations
Performance issues
Impact of tuning on NFS performance
Objectives (I)
Machine and Operating System
Independence
Could be implemented on low-end machines
of the mid-80s
Fast Crash Recovery
Major reason behind stateless design
Transparent Access
Remote files should be accessed in exactly
the same way as local files
Objectives (II)
UNIX semantics should be
maintained on client
Best way to achieve transparent access
Reasonable performance
Robustness and preservation of UNIX
semantics were much more important
Basic design
Three important parts
The protocol
The server side
The client side
The protocol (I)
Uses the Sun RPC mechanism and Sun
eXternal Data Representation (XDR)
standard
Protocol is stateless
Each procedure call contains all the
information necessary to complete the call
Advantages of statelessness
Crash recovery is very easy:
When a server crashes, client just resends
request until it gets an answer from the
rebooted server
Client cannot tell difference between a
server that has crashed and recovered and
a slow server
Client can always repeat any request
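The recovery behaviour described above amounts to a retry loop on the client. The sketch below is a simplified model: `send` stands in for an RPC call, `ServerDown` stands in for a timeout, and the `delay` and `max_tries` parameters are assumptions added so the example can be run and bounded.

```python
import time


class ServerDown(Exception):
    """Stands in for an RPC timeout."""


def call_until_answered(send, request, delay=0.0, max_tries=None):
    """Resend the same self-contained request until the server answers."""
    tries = 0
    while True:
        tries += 1
        try:
            return send(request)
        except ServerDown:
            if max_tries is not None and tries >= max_tries:
                raise
            time.sleep(delay)  # a slow server and a rebooting one look the same
```

This works only because each request carries everything the server needs; there is no session to re-establish after a reboot.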
Consequences of
statelessness
Reads and writes must specify their start offset
Server does not keep track of current position in
the file
Users still use conventional UNIX reads and writes
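One way to picture this split: the server answers reads at explicit offsets, and the client side keeps the "current position" so user code can still call a plain read. The class names and the string file handle below are hypothetical; this is a sketch of the idea, not the NFS client code.

```python
class StatelessServer:
    """Keeps file contents only; no open-file table, no current position."""

    def __init__(self, files):
        self.files = files  # file handle -> bytes

    def read(self, handle, offset, count):
        return self.files[handle][offset:offset + count]


class ClientFile:
    """Client-side object that gives the user a UNIX-like read()."""

    def __init__(self, server, handle):
        self.server, self.handle = server, handle
        self.position = 0  # the current position lives on the client

    def read(self, count):
        data = self.server.read(self.handle, self.position, count)
        self.position += len(data)
        return data
```

Repeating the same server read returns the same bytes, which is why a client may safely resend it after a timeout.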
Server side (II)
File handle consists of
Filesystem id identifying disk partition
I-node number identifying file within
partition
Generation number changed every time
i-node is reused to store a new file
Server will store
Filesystem id in filesystem superblock
I-node generation number in i-node
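The role of the generation number can be shown with a small model. The field names follow the three parts listed above; the `Partition` class, the `StaleHandle` exception, and the in-memory table are assumptions made for the example.

```python
from typing import NamedTuple


class FileHandle(NamedTuple):
    fsid: int        # identifies the disk partition
    inode: int       # identifies the file within the partition
    generation: int  # bumped each time the i-node is reused


class StaleHandle(Exception):
    pass


class Partition:
    def __init__(self, fsid):
        self.fsid = fsid
        self.generation = {}  # i-node number -> current generation number

    def allocate(self, inode):
        """(Re)use an i-node for a new file and hand out a fresh handle."""
        self.generation[inode] = self.generation.get(inode, 0) + 1
        return FileHandle(self.fsid, inode, self.generation[inode])

    def check(self, handle):
        """Reject handles that refer to an earlier use of the i-node."""
        if handle.fsid != self.fsid or self.generation.get(handle.inode) != handle.generation:
            raise StaleHandle(handle)
        return True
```

Without the generation number, a client holding an old handle could silently read or write a different file that happens to reuse the same i-node.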
Client side (I)
Provides transparent interface to NFS
Mapping between remote file names
and remote file addresses is done at
server boot time through remote
mount
Extension of UNIX mounts
Specified in a mount table
Makes a remote subtree appear part of a
local subtree
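A mount table of this kind can be modelled as a mapping from local mount points to remote directories. The table contents, the server name, and the `resolve_path` helper below are hypothetical; the sketch only shows how a path under a mount point is redirected to a server.

```python
# Hypothetical mount table: local mount point -> (server name, remote directory).
MOUNT_TABLE = {
    "/usr/remote": ("serverA", "/export/usr"),
}


def resolve_path(path):
    """Return ('local', path) or (server, remote_path) for a client path name."""
    best = None
    for mount_point in MOUNT_TABLE:
        if path == mount_point or path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point  # prefer the longest matching mount point
    if best is None:
        return ("local", path)
    server, remote_dir = MOUNT_TABLE[best]
    return (server, remote_dir + path[len(best):])
```

For example, `resolve_path("/usr/remote/bin/ls")` yields a path on `serverA`, while `/etc/passwd` stays local.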
Remote mount
[Figure: remote mount of a server subtree into the client tree. Labels: /, usr, bin, rmount, RPC/XDR, disk, LAN.]
The Mount Protocol
The mount protocol provides four basic services
that clients need before they can use NFS:
It allows the client to obtain a list of the directory
hierarchies (i.e. the file systems) that the client can
access through NFS.
It accepts full path names that allow the client to identify
a particular directory hierarchy.
It authenticates each client's request and validates the
client's permission to access the requested hierarchy.
It returns a file handle for the root directory of the
hierarchy a client specifies.
The client uses the root handle obtained from the
mount protocol when making NFS calls.
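The four services can be sketched as a toy mount server. Everything here is illustrative: the class and method names are hypothetical, "authentication" is reduced to a client-name check against an export list, and the returned root handle is an opaque placeholder string rather than a real NFS file handle.

```python
class MountError(Exception):
    pass


class MountServer:
    """Toy mount service; exports maps a directory to the set of allowed clients."""

    def __init__(self, exports):
        self.exports = exports

    def list_exports(self, client):
        """Service 1: the hierarchies this client may access."""
        return sorted(p for p, allowed in self.exports.items() if client in allowed)

    def mount(self, client, path):
        """Services 2-4: accept a full path, check permission, return a root handle."""
        if path not in self.exports:
            raise MountError("not exported: " + path)
        if client not in self.exports[path]:
            raise MountError("permission denied for " + client)
        return "root-handle:" + path  # opaque placeholder for a real file handle
```

The client would then pass the returned root handle in its subsequent NFS calls for files under that hierarchy.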
1. The client's request is sent via RPC to the mount server (on the server machine).
4. Server maintains list of clients and mounted directories -- this is state information!
But this data is only a "hint" and isn't treated as essential.
THE NFS PROTOCOL:
Note:
NFS servers are stateless. Each request must provide all information. With a server
crash, no information is lost.
Modified data must actually get to server disk before client is informed the action is
complete. Using a cache would imply state information.
A single NFS write is atomic. A client write request may be broken into several atomic
RPC calls, so the whole thing is NOT atomic.
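To see why a large client write is not atomic as a whole, consider how it might be split into RPC-sized pieces. The helper name `split_write` and the `max_rpc_bytes` limit are assumptions for illustration; the actual transfer size depends on the implementation.

```python
def split_write(offset, data, max_rpc_bytes):
    """Break one client write into (offset, chunk) pairs, one per RPC."""
    return [
        (offset + i, data[i:i + max_rpc_bytes])
        for i in range(0, len(data), max_rpc_bytes)
    ]
```

Each pair is applied atomically by the server, but another client's write may land between two of them, so the combined result can interleave data from both writers.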
NFS ARCHITECTURE:
CACHES OF REMOTE DATA:
The local kernel keeps the data after fetching it the first time.
On an open, the local kernel checks with the server that the cached data is
still valid.
NFS solution (I)
Stateless server does not know how
many users are accessing a given file
Clients do not know either
Clients must
Frequently send their modified blocks to
the server
Frequently ask the server to revalidate the
blocks they have in their cache
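The revalidation half of this scheme can be sketched as a client that trusts a cached copy for a short interval and otherwise compares modification times with the server. The class names, the whole-file granularity, and the 3-second freshness interval are assumptions chosen for the example, not values taken from the slides.

```python
class CachedFile:
    """Client-side copy of a file plus the server mtime it was fetched at."""

    def __init__(self, data, mtime, checked_at):
        self.data, self.mtime, self.checked_at = data, mtime, checked_at


class CachingClient:
    def __init__(self, server, freshness=3.0):
        self.server = server        # must offer getattr(name) and read(name)
        self.freshness = freshness  # seconds a cached copy is trusted (assumed value)
        self.cache = {}

    def read(self, name, now):
        entry = self.cache.get(name)
        if entry is not None and now - entry.checked_at < self.freshness:
            return entry.data                  # trusted without asking the server
        mtime = self.server.getattr(name)      # revalidate with the server
        if entry is not None and entry.mtime == mtime:
            entry.checked_at = now             # still valid, keep the cached data
            return entry.data
        data = self.server.read(name)          # stale or missing: refetch
        self.cache[name] = CachedFile(data, mtime, now)
        return data
```

Between checks a client can return stale data, which is the price of not having the server track who is caching what.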
Hard issues (I)
NFS root file systems cannot be shared:
Too many problems
Clients can mount any remote subtree
any way they want:
Could have different names for same
subtree by mounting it in different places
NFS uses a set of basic mounted
filesystems on each machine and lets users
do the rest
Hard issues (II)
NFS passes user id, group id and
groups on each call
Requires same mapping from user id and
group id to user on all machines
Conclusion
To allow many clients to access a server and to keep
the servers isolated from client crashes, NFS uses
stateless servers.