DISTRIBUTED FILE SYSTEMS
an introduction to general DFS
DEFINITIONS:
• A traditional Distributed File System ( DFS ) is simply a classical model of a file
system distributed across multiple machines. The purpose is to promote sharing of
dispersed files with good transparency to users.
• The resources on a particular machine are local to itself. Resources on other
machines are remote.
• A file system provides a service for clients. The server interface is the normal set of file
operations: create, read, etc. on files.
Clients, servers, and storage are dispersed across machines. Configuration and
implementation may vary -
a) Servers may run on dedicated machines, OR
b) Servers and clients can be on the same machines.
c) The OS itself can be distributed (with the file system a part of that distribution).
d) A distribution layer can be interposed between a conventional OS and the file
system.
Clients should view a DFS the same way they would a centralized FS; the distribution is
hidden at a lower level.
Performance is measured by throughput and response time.
Naming and Transparency
Naming is the mapping between logical and physical objects.
• In a conventional file system, it's understood where the file actually resides; the
system and disk are known.
• In a transparent DFS, the location of a file, somewhere in the network, is hidden.
Location transparency -
The name of a file does not reveal any hint of the file's physical storage location
(machine, disk, or disk blocks).
Location independence -
• The name of a file doesn't need to be changed when the file's physical storage
location changes. Dynamic, one-to-many mapping: over time, the same name can map to
different locations.
• Separates the naming hierarchy from the storage devices hierarchy.
Most DFSs today:
• Support location transparency.
NAMING SCHEMES:
1. Remote directories are mounted to local directories.
• So a local system seems to have a coherent directory structure.
• The remote directories must be explicitly mounted. The files are location
independent.
• SUN NFS is a good example of this technique.
2. A single global name structure spans all the files in the system.
• The DFS is built the same way as a local filesystem. Location independent.
• GFS and HDFS work in this way.
IMPLEMENTATION TECHNIQUES:
• Can map directories or larger aggregates rather than individual files.
• A non-transparent mapping technique:
name ----> < system, disk, cylinder, sector >
• A transparent mapping technique:
name ----> file_identifier ----> < system, disk, cylinder, sector >
• So when changing the physical location of a file, only the mapping from the file
identifier to its location needs to be updated; the name itself does not change. This
identifier must be "unique" in the universe.
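The transparent mapping can be sketched as two small tables. This is a minimal illustration, not how any particular DFS stores its metadata: the table names, the functions create, resolve and migrate, and the use of a random UUID as the "unique" identifier are all assumptions made for the example.

```python
import uuid

# Hypothetical two-level name table: name -> file_identifier -> location.
name_to_id = {}
id_to_location = {}

def create(name, location):
    """Register a new file under `name` at `location`, a
    (system, disk, cylinder, sector) tuple."""
    file_id = uuid.uuid4().hex  # assumed stand-in for a universe-unique identifier
    name_to_id[name] = file_id
    id_to_location[file_id] = location
    return file_id

def resolve(name):
    """Follow name -> file_identifier -> physical location."""
    return id_to_location[name_to_id[name]]

def migrate(name, new_location):
    """Move a file: the name -> identifier entry is left untouched."""
    id_to_location[name_to_id[name]] = new_location

fid = create("/docs/report.txt", ("hostA", "disk0", 12, 7))
migrate("/docs/report.txt", ("hostB", "disk3", 40, 2))
```

After the migration the name still resolves through the same identifier, which is what lets clients keep using the old name.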
Remote File Access
CACHING
• Reduce network traffic by retaining recently accessed disk blocks in a cache, so that
repeated accesses to the same information can be handled locally.
• If required data is not already cached, a copy of data is brought from the server to the
user.
• Perform accesses on the cached copy.
• Each file has one master copy residing on the server machine.
• Copies of (parts of) the file are scattered in different caches.
Cache Consistency Problem -- Keeping the cached copies consistent with the master file.
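A minimal sketch of the client-side block cache described above. The class names, the LRU eviction policy, and the tiny capacity are assumptions chosen for illustration; the slides do not specify a replacement policy.

```python
from collections import OrderedDict

class Server:
    """Holds the master copy of each file as a list of blocks."""
    def __init__(self, files):
        self.files = files
        self.reads = 0  # number of remote block fetches served

    def read_block(self, name, index):
        self.reads += 1
        return self.files[name][index]

class ClientCache:
    """LRU cache of (file name, block index) -> block data (policy is assumed)."""
    def __init__(self, server, capacity=2):
        self.server = server
        self.capacity = capacity
        self.blocks = OrderedDict()

    def read(self, name, index):
        key = (name, index)
        if key in self.blocks:
            self.blocks.move_to_end(key)      # hit: handled locally
            return self.blocks[key]
        data = self.server.read_block(name, index)  # miss: copy from server
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict least recently used block
        return data

server = Server({"f": [b"aaaa", b"bbbb", b"cccc"]})
cache = ClientCache(server)
first = cache.read("f", 0)
again = cache.read("f", 0)  # repeated access, no network traffic
```

Nothing here tells the client when the master copy changes, which is exactly the cache consistency problem.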
CACHE UPDATE POLICY:
• Write-through - good reliability, but the user must wait for each write to reach the
server. Used by NFS.
• Delayed write - write requests complete more rapidly. A block may be overwritten in the
cache before it is sent, saving a remote write. Poor reliability: writes not yet sent are
lost on a crash.
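The difference between the two policies can be seen by counting remote writes. This is a sketch under simplifying assumptions: the class names are hypothetical, the "server" is an in-memory dictionary, and the delayed-write cache flushes only when asked to.

```python
class Server:
    def __init__(self):
        self.blocks = {}
        self.remote_writes = 0

    def write(self, key, data):
        self.remote_writes += 1
        self.blocks[key] = data

class WriteThroughCache:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def write(self, key, data):
        self.cache[key] = data
        self.server.write(key, data)  # caller waits for the server every time

class DelayedWriteCache:
    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.dirty = set()

    def write(self, key, data):
        self.cache[key] = data        # completes locally
        self.dirty.add(key)

    def flush(self):
        for key in sorted(self.dirty):
            self.server.write(key, self.cache[key])
        self.dirty.clear()

wt_server, dw_server = Server(), Server()
wt, dw = WriteThroughCache(wt_server), DelayedWriteCache(dw_server)
for version in (b"v1", b"v2", b"v3"):
    wt.write("blk0", version)
    dw.write("blk0", version)
unflushed = "blk0" not in dw_server.blocks  # would be lost if the client crashed now
dw.flush()
```

Three writes to the same block cost three remote writes under write-through and one under delayed write, at the price of a window in which the server has none of them.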
FILE REPLICATION:
• Duplicating files on multiple machines improves availability and performance.
• Replicas are placed on failure-independent machines (they won't fail together).
Replication management should be "location-opaque".
• The main problem is consistency - when one copy changes, how do other copies reflect
that change? Often there is a tradeoff: consistency versus availability and performance.
• Atomic and serialized invalidation isn't guaranteed (a message could get lost or a
machine could crash).
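The consistency versus availability tradeoff can be illustrated with two replicas and a missed update. This is a toy model, not a real replication protocol: the Replica class, the version counter, and the "write to whoever is reachable" rule are assumptions made for the example.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.version = 0
        self.data = b""
        self.up = True

def write_all(replicas, data):
    """Push a new version to every reachable replica; return the names missed."""
    new_version = max(r.version for r in replicas) + 1
    missed = []
    for r in replicas:
        if r.up:
            r.version, r.data = new_version, data
        else:
            missed.append(r.name)
    return missed

def read_latest(replicas):
    """Read from the reachable replica with the highest version number."""
    live = [r for r in replicas if r.up]
    return max(live, key=lambda r: r.version).data

a, b = Replica("a"), Replica("b")
write_all([a, b], b"v1")
b.up = False
missed = write_all([a, b], b"v2")  # b never sees this update
b.up, a.up = True, False
stale = read_latest([a, b])        # still available, but returns old data
```

Accepting the write while b is down keeps the file available, and that same choice is what later allows a stale read.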
Example: SUN Network File System
OVERVIEW:
• Runs on SunOS. NFS is both a specification of how to access remote files and a
specific implementation of that specification.
• The goal: to share a file system in a transparent way.
• Uses the client-server model (in NFS, a node can be both simultaneously). Can act
between any two nodes (no dedicated server). Mount makes a server file system visible
from a client.
mount server:/usr/shared client:/usr/local
• Then, transparently, a request for /usr/local/dir-server accesses a file that is on the
server.
• The mount is controlled by: (1) access rights, (2) server specification of what's
mountable.
• Can use heterogeneous machines - different hardware, operating systems, network
protocols.
• Uses RPC for isolation - thus all implementations must provide the same RPC calls. These
RPCs implement the mount protocol and the NFS protocol.
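The effect of the mount in the overview can be sketched as a client-side table lookup. This is an illustration only: the mount_table dictionary and the resolve function are hypothetical and do not reflect how a kernel stores its mounts. The entry mirrors the example above, with the server's /usr/shared mounted on the client's /usr/local.

```python
import posixpath

# Hypothetical client mount table: local mount point -> (server, remote directory).
mount_table = {"/usr/local": ("server", "/usr/shared")}

def resolve(path):
    """Return ("local", path) or (server, remote_path) for a client path."""
    path = posixpath.normpath(path)
    # Prefer the longest matching mount point.
    for mount_point in sorted(mount_table, key=len, reverse=True):
        if path == mount_point or path.startswith(mount_point + "/"):
            server, remote_dir = mount_table[mount_point]
            rest = path[len(mount_point):]
            return server, remote_dir + rest
    return "local", path

where = resolve("/usr/local/dir-server")
```

A path such as /usr/localx does not match the mount point and stays local, which is why the check compares whole path components rather than raw string prefixes.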
17: Distributed File Systems 10
THE MOUNT PROTOCOL:
The following operations occur:
1. The client's request is sent via RPC to the mount server (on the server machine).
2. The mount server checks the export list, which contains
a) file systems that can be exported,
b) legal requesting clients.
Any directory within an exported file system may legitimately be mounted.
3. Server returns "file handle" to client.
4. Server maintains list of clients and mounted directories -- this is state information!
But this data is only a "hint" and isn't treated as essential.
5. Mounting often occurs automatically when client or server boots.
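Steps 1 to 4 can be sketched as a single server-side check. This is a simplified model, not the real mount daemon: the MountServer class, the shape of the export list, and the random hex string used as a file handle are assumptions for the example.

```python
import secrets

class MountServer:
    def __init__(self, exports):
        # exports: exported file system root -> set of permitted client names
        self.exports = exports
        self.handles = {}   # file handle -> directory
        self.mounted = []   # (client, directory) pairs, kept only as a hint

    def mount(self, client, directory):
        for root, clients in self.exports.items():
            # Any directory inside an exported file system may be mounted.
            inside = directory == root or directory.startswith(root + "/")
            if inside and client in clients:
                handle = secrets.token_hex(8)  # opaque to the client
                self.handles[handle] = directory
                self.mounted.append((client, directory))
                return handle
        raise PermissionError(f"{client} may not mount {directory}")

srv = MountServer({"/usr/shared": {"clientA"}})
handle = srv.mount("clientA", "/usr/shared/docs")
try:
    srv.mount("clientB", "/usr/shared")
    denied = False
except PermissionError:
    denied = True
```

The mounted list could be discarded at any time without breaking a client that already holds a handle, which is the sense in which it is only a hint.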
THE NFS PROTOCOL:
RPCs support these remote file operations:
a) Search for file within directory.
b) Read a set of directory entries.
c) Manipulate links and directories.
d) Read/write file attributes.
e) Read/write file data.
Note:
• Open and close are conspicuously absent from this list. NFS servers are stateless:
each request must carry all the information needed to serve it, so a server crash loses
no per-client state.
• Modified data must actually reach the server's disk before the client is told the action
is complete. Holding writes in a server cache would amount to state information.
• A single NFS write is atomic. A client write request may be broken into several atomic
RPC calls, so the whole thing is NOT atomic. Since lock management is stateful, NFS
doesn't do it. A higher level must provide this service.
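Statelessness can be sketched by making every request carry the file handle, offset, and length. This is an illustration under assumptions: the StatelessServer class is hypothetical, a dictionary of bytearrays stands in for the server's disk, and a "crash" is modelled by building a fresh server object over the same disk.

```python
class StatelessServer:
    """Keeps no per-client state: no open-file table and no file offsets."""
    def __init__(self, disk):
        self.disk = disk  # stands in for stable storage: handle -> bytearray

    def read(self, handle, offset, count):
        return bytes(self.disk[handle][offset:offset + count])

    def write(self, handle, offset, data):
        buf = self.disk[handle]
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # zero-fill any gap
        buf[offset:offset + len(data)] = data
        return len(data)  # the reply is sent only after the data is "on disk"

disk = {"fh1": bytearray(b"hello world")}
server = StatelessServer(disk)
before = server.read("fh1", 6, 5)
server = StatelessServer(disk)  # simulate a crash and restart
after = server.read("fh1", 6, 5)
server.write("fh1", 0, b"HELLO")
```

Because the client supplies the offset itself, it can simply repeat a request after a restart and get the same answer.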
NFS ARCHITECTURE:
[Figure: NFS architecture, showing the paths taken by local and remote accesses.]
1. UNIX filesystem layer - does normal open / read / etc. commands.
2. Virtual file system ( VFS ) layer -
a) Gives clean layer between user and filesystem.
b) Acts as deflection point by using global vnodes.
c) Understands the difference between local and remote names.
d) Keeps in memory information about what should be deflected (mounted
directories) and how to get to these remote directories.
3. System call interface layer -
a) Presents sanitized validated requests in a uniform way to the VFS.
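The "deflection point" role of the VFS layer can be sketched with two vnode types behind one call. This is a toy model: the class names, the vfs_read function, and the fake_rpc callable standing in for an NFS RPC are all assumptions for illustration.

```python
class LocalVnode:
    """A vnode backed by the local file system."""
    def __init__(self, data):
        self.data = data

    def read(self):
        return self.data

class RemoteVnode:
    """A vnode for a file under a mounted remote directory."""
    def __init__(self, rpc, handle):
        self.rpc = rpc        # hypothetical callable standing in for an NFS RPC
        self.handle = handle

    def read(self):
        return self.rpc("read", self.handle)

def vfs_read(vnode):
    """The caller never learns whether the file is local or remote."""
    return vnode.read()

remote_store = {"fh7": b"remote bytes"}

def fake_rpc(op, handle):
    if op != "read":
        raise ValueError(f"unsupported operation: {op}")
    return remote_store[handle]

local = vfs_read(LocalVnode(b"local bytes"))
remote = vfs_read(RemoteVnode(fake_rpc, "fh7"))
```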
PATH-NAME TRANSLATION:
• Break the complete pathname into components.
• For each component, do an NFS lookup using the
component name + directory vnode.
• After a mount point is reached, each component piece will cause a server access.
• Can't hand the whole operation to server since the client may have a second mount on a
subsidiary directory (a mount on a mount ).
• A directory name cache on the client speeds up lookups.
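Component-by-component translation with a directory name cache can be sketched as follows. The server's directory tree, the handle strings, and the nfs_lookup function are hypothetical stand-ins for the lookup RPC; the counter exists only to show how many server accesses occur.

```python
# Hypothetical server directory tree: directory handle -> {name: child handle}.
server_dirs = {
    "root": {"usr": "h1"},
    "h1": {"shared": "h2"},
    "h2": {"notes.txt": "h3"},
}
lookup_calls = 0

def nfs_lookup(dir_handle, name):
    """Stands in for one lookup RPC: (directory, component) -> handle."""
    global lookup_calls
    lookup_calls += 1
    return server_dirs[dir_handle][name]

name_cache = {}  # (directory handle, component) -> handle

def translate(path, root="root"):
    """Resolve a path one component at a time, consulting the cache first."""
    handle = root
    for component in path.strip("/").split("/"):
        key = (handle, component)
        if key not in name_cache:
            name_cache[key] = nfs_lookup(handle, component)
        handle = name_cache[key]
    return handle

first = translate("/usr/shared/notes.txt")
calls_after_first = lookup_calls
second = translate("/usr/shared/notes.txt")
```

The first translation costs one server access per component; the second is answered entirely from the client's name cache.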
CACHES OF REMOTE DATA:
• The client keeps:
File block cache - ( the contents of a file )
File attribute cache - ( file header info (inode in UNIX) ).
• The local kernel hangs on to the data after getting it the first time.
• On an open, the local kernel checks with the server that cached data is still OK.
• Cached attributes are thrown away after a few seconds.
• Data blocks use read ahead and delayed write.
• Mechanism has:
Server consistency problems.
Good performance.
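The attribute cache and its consistency problem can be sketched with a time-to-live check. This is a minimal model: the AttrCache class, the fetch callable, and the 3-second lifetime are assumptions (the slides say only "a few seconds"), and the clock is passed in explicitly so the behaviour is easy to follow.

```python
class AttrCache:
    def __init__(self, fetch, ttl=3.0):
        self.fetch = fetch    # hypothetical callable: handle -> attributes
        self.ttl = ttl        # seconds; the value is an assumption
        self.entries = {}     # handle -> (attributes, time fetched)

    def get(self, handle, now):
        hit = self.entries.get(handle)
        if hit is not None and now - hit[1] < self.ttl:
            return hit[0]     # still fresh enough: no server access
        attrs = self.fetch(handle)
        self.entries[handle] = (attrs, now)
        return attrs

server_attrs = {"fh1": {"size": 10}}
fetches = []

def fetch(handle):
    fetches.append(handle)
    return dict(server_attrs[handle])

cache = AttrCache(fetch)
a1 = cache.get("fh1", now=0.0)
server_attrs["fh1"]["size"] = 20   # another client changes the file
a2 = cache.get("fh1", now=1.0)     # served from cache: stale size
a3 = cache.get("fh1", now=5.0)     # entry expired: refetched from the server
```

The second read is fast but wrong, and the third is correct but needs a server access: the performance and consistency sides of the same mechanism.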