CSE352: Parallel and Distributed Systems
Distributed File Systems (DFS)
Learning objectives
Understand the requirements that affect the design of
distributed services
Understand how a relatively simple, widely-used service
(NFS) is designed
Obtain a knowledge of file systems, both local and
networked
Caching as an essential design technique
Security requires special consideration
Recent advances in Distributed File Systems
Introduction
Why do we need a DFS?
Primary purpose of a Distributed System…
Connecting users and resources
Resources…
… can be inherently distributed
… can actually be data (files, databases, …) and…
… their availability becomes a crucial issue for the
performance of a Distributed System
Introduction
A Case for DS
[Figure: several users all keep their files on a single server, Server A. Speech bubbles:]
"I want to store my thesis on the server!"
"I need to store my analysis and reports safely…"
"I need to have my book always available.."
"I need storage for my reports"
"My boss wants…"
Introduction
A Case for DS
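[Figure: the users' files are now spread over three separate servers, Server A, Server B and Server C. Speech bubbles:]
"Wow… now I can store a lot more documents…"
"Hey… but where did I put my docs?"
"I am not sure whether server A, or B, or C…"
"I don’t remember.."
"Same here…"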
Introduction
A Case for DS
[Figure: Server A, Server B and Server C are presented to the users as a single Distributed File System. Speech bubbles:]
"Wow! I do not have to remember which server I stored the data into…"
"Good… I can access my folders from anywhere.."
"Nice… my boss will promote me!"
Storage Systems and Their Properties
1974 - 1995: In the first generation of distributed systems, file systems (e.g. NFS) were the only networked storage systems.
1995 - 2007: With the advent of distributed object systems (CORBA, Java) and the web, the picture has become more complex.
2007 - now: The current focus is on large-scale, scalable storage.
Google File System
Amazon S3 (Simple Storage Service)
What is a File System?
Persistent stored data sets
Hierarchic name space visible to all processes
API with the following characteristics:
access and update operations on persistently stored data sets
Sequential access model (with additional random facilities)
Sharing of data between users, with access control
Concurrent access:
certainly for read-only access
what about updates?
Other features:
mountable file stores
more? ...
What is a File System?
UNIX file system operations
filedes = open(name, mode)
    Opens an existing file with the given name.
filedes = creat(name, mode)
    Creates a new file with the given name.
    Both operations deliver a file descriptor referencing the open file. The mode is read, write or both.
status = close(filedes)
    Closes the open file filedes.
count = read(filedes, buffer, n)
    Transfers n bytes from the file referenced by filedes to buffer.
count = write(filedes, buffer, n)
    Transfers n bytes to the file referenced by filedes from buffer.
    Both operations deliver the number of bytes actually transferred and advance the read-write pointer.
pos = lseek(filedes, offset, whence)
    Moves the read-write pointer to offset (relative or absolute, depending on whence).
status = unlink(name)
    Removes the file name from the directory structure. If the file has no other names, it is deleted.
status = link(name1, name2)
    Adds a new name (name2) for a file (name1).
status = stat(name, buffer)
    Gets the file attributes for file name into buffer.
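The operations in the table map closely onto Python's `os` module, which wraps the same UNIX system calls. The following is a minimal sketch, assuming a POSIX-like system with hard-link support; the function name and file names are hypothetical, and everything happens in a temporary directory.

```python
import os
import tempfile

def demo_unix_file_ops():
    """Exercise open/write/lseek/read/link/stat/unlink on a scratch file."""
    workdir = tempfile.mkdtemp()
    name1 = os.path.join(workdir, "notes.txt")        # hypothetical file name
    name2 = os.path.join(workdir, "notes-alias.txt")  # hypothetical second name

    # With O_CREAT, os.open creates the file if it does not exist (like creat).
    fd = os.open(name1, os.O_RDWR | os.O_CREAT, 0o664)
    written = os.write(fd, b"hello, dfs")   # advances the read-write pointer
    os.lseek(fd, 7, os.SEEK_SET)            # absolute seek to byte 7
    tail = os.read(fd, 3)                   # reads the last three bytes
    os.close(fd)

    os.link(name1, name2)                   # add a second name for the same file
    links = os.stat(name1).st_nlink         # reference count after link
    os.unlink(name1)                        # the file survives under name2
    size = os.stat(name2).st_size           # file length attribute

    os.unlink(name2)                        # last name removed: file is deleted
    os.rmdir(workdir)
    return written, tail, links, size
```

The returned tuple holds the byte count from write, the bytes read after the seek, the reference count while both names exist, and the file length seen through the second name.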
What is a File System?
File attribute record structure
Updated by system:
File length
Creation timestamp
Read timestamp
Write timestamp
Attribute timestamp
Reference count
Updated by owner:
Owner
File type
Access control list (e.g. for UNIX: rw-rw-r--)
File Service Architecture
An architecture that offers a clear separation of the
main concerns in providing access to files is
obtained by structuring the file service as three
components, which are present in any file system
(centralized or distributed):
A flat file service
Deals with the blocks of the file system in a flat way.
A directory service
Deals with files hierarchically, as folders.
A client module
The client module implements the interfaces exported
by the flat file and directory services on the server side.
Model File Service Architecture
[Figure: model file service architecture. On the client computer, application programs call the client module. On the server computer, the directory service provides Lookup, AddName, UnName and GetNames, and the flat file service provides Read, Write, Create, Delete, GetAttributes and SetAttributes.]
Responsibilities of Various Modules
Flat file Service
Concerned with the implementation of operations on the contents of files. Unique
File Identifiers (UFIDs) are used to refer to files in all requests for flat file service
operations. UFIDs are long sequences of bits chosen so that each file has a
unique ID among all of the files in a distributed system.
Directory Service
Provides a mapping between text names for files and their UFIDs. Clients may
obtain the UFID of a file by quoting its text name to the directory service. The directory
service supports the functions needed to generate directories and to add new files
to directories.
Client Module
It runs on each computer and provides integrated service (flat file and directory)
as a single API to application programs. For example, in UNIX hosts, a client
module emulates the full set of UNIX file operations.
It holds information about the network locations of the flat file and directory server
processes, and achieves better performance by keeping a cache
of recently used file blocks at the client.
Server operations/interfaces for the model
file service
Flat file service
Read(FileId, i, n) -> Data (i: position of first byte)
Write(FileId, i, Data) (i: position of first byte)
Create() -> FileId
Delete(FileId)
GetAttributes(FileId) -> Attr
SetAttributes(FileId, Attr)
Directory service
Lookup(Dir, Name) -> FileId
AddName(Dir, Name, File)
UnName(Dir, Name)
GetNames(Dir, Pattern) -> NameSeq
FileId
A unique identifier for files anywhere in the network.
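The two interfaces can be illustrated with an in-memory sketch. This is not a real file server: the class names are hypothetical, a random UUID stands in for a UFID, byte positions are assumed to start at 0, and directories are identified by plain strings.

```python
import fnmatch
import uuid

class FlatFileService:
    """In-memory stand-in for the model flat file service."""

    def __init__(self):
        self._files = {}   # FileId -> bytearray holding the contents
        self._attrs = {}   # FileId -> dict of attributes

    def create(self):
        file_id = uuid.uuid4().hex   # assumption: a random UUID acts as the UFID
        self._files[file_id] = bytearray()
        self._attrs[file_id] = {}
        return file_id

    def read(self, file_id, i, n):
        return bytes(self._files[file_id][i:i + n])

    def write(self, file_id, i, data):
        buf = self._files[file_id]
        if len(buf) < i:                       # pad with zero bytes up to position i
            buf.extend(b"\0" * (i - len(buf)))
        buf[i:i + len(data)] = data

    def delete(self, file_id):
        del self._files[file_id]
        del self._attrs[file_id]

    def get_attributes(self, file_id):
        return dict(self._attrs[file_id], length=len(self._files[file_id]))

    def set_attributes(self, file_id, attr):
        self._attrs[file_id].update(attr)


class DirectoryService:
    """In-memory stand-in for the model directory service."""

    def __init__(self):
        self._dirs = {}    # Dir -> {Name: FileId}

    def add_name(self, dir_id, name, file_id):
        entries = self._dirs.setdefault(dir_id, {})
        if name in entries:
            raise FileExistsError(name)
        entries[name] = file_id

    def lookup(self, dir_id, name):
        return self._dirs[dir_id][name]        # KeyError if the name is absent

    def un_name(self, dir_id, name):
        del self._dirs[dir_id][name]

    def get_names(self, dir_id, pattern):
        names = self._dirs.get(dir_id, {})
        return sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))
```

A client module would combine the two: it calls `lookup` to turn a text name into a FileId, then passes that FileId to `read` or `write`.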
File Group
A collection of files that can be located on any server or moved between servers while maintaining the same names.
Similar to a UNIX filesystem
Helps with distributing the load of file serving between several servers.
File groups have identifiers which are unique throughout the system (and hence for an open system, they must be globally unique).
Used to refer to file groups and files
To construct a globally unique ID we use some unique attribute of the machine on which it is created, e.g. IP address, even though the file group may move subsequently.
File Group ID: IP address (32 bits) + date (16 bits)
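A file group ID of this shape can be packed into a single 48-bit integer. The sketch below is only an illustration: the slide does not say how the date is encoded in 16 bits, so the code assumes days since 1 January 1970, truncated to 16 bits, and both function names are hypothetical.

```python
import ipaddress
from datetime import date

def make_file_group_id(ip, created, epoch=date(1970, 1, 1)):
    """Pack a 32-bit IPv4 address and a 16-bit date field into one integer."""
    ip_bits = int(ipaddress.IPv4Address(ip))      # 32 bits
    # Assumption: the date field is the day count since `epoch`, kept to 16 bits.
    day_bits = (created - epoch).days & 0xFFFF
    return (ip_bits << 16) | day_bits

def split_file_group_id(group_id):
    """Recover the (IP address, date field) pair from a packed ID."""
    return str(ipaddress.IPv4Address(group_id >> 16)), group_id & 0xFFFF
```

The IP address only records where the group was created. The ID stays the same if the group later moves to another server.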
Case Study: Sun (Oracle) NFS
An industry standard for file sharing on local networks since the 1980s
An open standard with clear and simple interfaces
Supports many of the design requirements already mentioned:
transparency
heterogeneity
efficiency
fault tolerance
Limited achievement of:
concurrency
replication
consistency
security
NFS Architecture
[Figure: NFS architecture. On the client computer, application programs issue UNIX system calls to the UNIX kernel. There the virtual file system passes operations on local files to the UNIX file system (or another file system) and operations on remote files to the NFS client. On the server computer, the UNIX kernel contains a virtual file system, the NFS server and the UNIX file system. The NFS client and the NFS server communicate using the NFS protocol (remote operations).]
NFS Architecture
Does the implementation have to be in the system kernel?
No:
there are examples of NFS clients and servers that run at
application-level as libraries or processes (e.g. early Windows and
MacOS implementations)
But, for a UNIX implementation there are advantages:
Binary code compatible - no need to recompile applications
Standard system calls that access remote files can be routed
through the NFS client module by the kernel
Shared cache of recently-used blocks at client
Kernel-level server can access i-nodes and file blocks directly
Security of the encryption key used for authentication
NFS Server Operations
fh = file handle: filesystem identifier, i-node number, i-node generation
• read(fh, offset, count) -> attr, data
• write(fh, offset, count, data) -> attr
• create(dirfh, name, attr) -> newfh, attr
• remove(dirfh, name) -> status
• getattr(fh) -> attr
• setattr(fh, attr) -> attr
• lookup(dirfh, name) -> fh, attr
• rename(dirfh, name, todirfh, toname)
• link(newdirfh, newname, dirfh, name)
• readdir(dirfh, cookie, count) -> entries
• symlink(newdirfh, newname, string) -> status
• readlink(fh) -> string
• mkdir(dirfh, name, attr) -> newfh, attr
• rmdir(dirfh, name) -> status
• statfs(fh) -> fsstats
Model flat file service
Read(FileId, i, n) -> Data
Write(FileId, i, Data)
Create() -> FileId
Delete(FileId)
GetAttributes(FileId) -> Attr
SetAttributes(FileId, Attr)
Model directory service
Lookup(Dir, Name) -> FileId
AddName(Dir, Name, File)
UnName(Dir, Name)
GetNames(Dir, Pattern) -> NameSeq
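The three file handle fields can be illustrated with a toy server. The generation number lets a server reject an old handle after its i-node number has been reused for a new file. The sketch is hypothetical and heavily simplified: it has a single root directory, no attributes, and method signatures shorter than the NFS operations listed above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FileHandle:
    fsid: int         # filesystem identifier
    inode: int        # i-node number
    generation: int   # i-node generation

class StaleHandle(Exception):
    """Raised when a handle no longer refers to a live file."""

class ToyNfsServer:
    def __init__(self, fsid=1):
        self.fsid = fsid
        self.generation = {0: 1}   # i-node number -> current generation (0 is the root)
        self.contents = {}         # i-node number -> file bytes
        self.root_entries = {}     # name -> i-node number
        self.free_inodes = []      # i-node numbers available for reuse
        self.next_inode = 1

    def root(self):
        return FileHandle(self.fsid, 0, self.generation[0])

    def _check(self, fh):
        if fh.fsid != self.fsid or self.generation.get(fh.inode) != fh.generation:
            raise StaleHandle(fh)

    def create(self, dirfh, name, data=b""):
        self._check(dirfh)
        if self.free_inodes:
            inode = self.free_inodes.pop()   # reuse a freed i-node number
        else:
            inode = self.next_inode
            self.next_inode += 1
            self.generation[inode] = 1
        self.contents[inode] = data
        self.root_entries[name] = inode
        return FileHandle(self.fsid, inode, self.generation[inode])

    def remove(self, dirfh, name):
        self._check(dirfh)
        inode = self.root_entries.pop(name)
        del self.contents[inode]
        self.generation[inode] += 1          # handles for the old file become stale
        self.free_inodes.append(inode)

    def lookup(self, dirfh, name):
        self._check(dirfh)
        inode = self.root_entries[name]
        return FileHandle(self.fsid, inode, self.generation[inode])

    def read(self, fh, offset, count):
        self._check(fh)
        return self.contents[fh.inode][offset:offset + count]
```

Because the handle itself carries everything needed to validate it, the server does not have to remember which clients hold which handles.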
NFS Access Control and Authentication
Stateless server, so the user's identity and access rights must
be checked by the server on each request.
Every client request is accompanied by the userID and groupID
Kerberos has been integrated with NFS to provide a stronger
and more comprehensive security solution
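Because the server keeps no session state, every request must carry enough identity for a permission check. A minimal sketch of such a per-request check against UNIX permission bits follows; the function and parameter names are hypothetical, and real servers apply further rules (for example for the superuser) that are omitted here.

```python
import stat

def access_allowed(user_id, group_id, file_owner, file_group, mode, want_write):
    """Decide one request from the IDs it carries; nothing is remembered between calls."""
    if user_id == file_owner:
        bit = stat.S_IWUSR if want_write else stat.S_IRUSR
    elif group_id == file_group:
        bit = stat.S_IWGRP if want_write else stat.S_IRGRP
    else:
        bit = stat.S_IWOTH if want_write else stat.S_IROTH
    return bool(mode & bit)
```

With mode `0o664` (rw-rw-r--), the owner and group members may write, while other users may only read. The check trusts the IDs supplied in the request, which is the weakness that stronger authentication such as Kerberos addresses.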
Architecture Components
(UNIX / Linux)
Server:
nfsd: NFS server daemon that services requests from
clients.
mountd: NFS mount daemon that carries out the mount
request passed on by nfsd.
rpcbind: RPC port mapper used to locate the nfsd
daemon.
/etc/exports: configuration file that defines which portion
of the file systems are exported through NFS and how.
Client:
mount: standard file system mount command.
/etc/fstab: file system table file.
nfsiod: (optional) local asynchronous NFS I/O server.
Mount Service
Mount operation:
mount(remotehost, remotedirectory,
localdirectory)
Server maintains a table of clients who have
mounted filesystems at that server
Each client maintains a table of mounted file
systems holding:
< IP address, port number, file handle>
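The two tables can be sketched as follows. This is an illustration only: the class names, addresses, paths and file handle strings are hypothetical, and no network communication takes place.

```python
class ToyMountService:
    """Hypothetical server-side mount service."""

    def __init__(self, ip, port, exports):
        self.ip = ip
        self.port = port
        self.exports = exports    # remotedirectory -> file handle of its root
        self.mounted_by = []      # table of (client, remotedirectory) pairs

    def mount(self, client, remotedirectory):
        fh = self.exports[remotedirectory]   # KeyError if the directory is not exported
        self.mounted_by.append((client, remotedirectory))
        return fh


class ToyClient:
    """Hypothetical client holding a table of mounted file systems."""

    def __init__(self, name):
        self.name = name
        self.mount_table = {}     # localdirectory -> (IP address, port number, file handle)

    def mount(self, server, remotedirectory, localdirectory):
        fh = server.mount(self.name, remotedirectory)
        self.mount_table[localdirectory] = (server.ip, server.port, fh)
```

After a successful mount, the client can route any operation under `localdirectory` to the recorded address, starting from the recorded file handle.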
Local and Remote File Systems Accessible on an
NFS Client
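[Figure: directory trees of Server 1, the Client and Server 2. Server 1: (root) contains export, which contains people (big, jon, bob, ...). Client: (root) contains vmunix and usr, and usr contains students, x and staff. Server 2: (root) contains nfs, which contains users (jim, ann, jane, joe). A remote mount attaches the people sub-tree of Server 1 at students on the client, and a second remote mount attaches the users sub-tree of Server 2 at staff.]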
Automounter
NFS client catches attempts to access 'empty' mount points
and routes them to the Automounter
Automounter has a table of mount points and multiple
candidate servers for each
it sends a probe message to each candidate server and
then uses the mount service to mount the filesystem at
the first server to respond
Keeps the mount table small
Provides a simple form of replication for read-only
filesystems
E.g. if there are several servers with identical copies of
/usr/lib then each server will have a chance of being
mounted at some clients.
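The probe-and-mount behaviour can be sketched as below. The probe is passed in as a function that returns a response time or `None`, so the example runs without a network. All names are hypothetical, and the probes are evaluated one after another rather than sent in parallel.

```python
def choose_server(candidates, probe):
    """Return the candidate with the shortest probe response time, or None."""
    best, best_time = None, None
    for server in candidates:
        elapsed = probe(server)              # seconds, or None if there was no reply
        if elapsed is not None and (best_time is None or elapsed < best_time):
            best, best_time = server, elapsed
    return best


class ToyAutomounter:
    def __init__(self, table, probe):
        self.table = table       # mount point -> list of candidate servers
        self.probe = probe
        self.mounted = {}        # mount point -> server actually mounted

    def access(self, mount_point):
        if mount_point not in self.mounted:  # 'empty' mount point: mount on demand
            server = choose_server(self.table[mount_point], self.probe)
            if server is None:
                raise OSError(f"no server responded for {mount_point}")
            self.mounted[mount_point] = server
        return self.mounted[mount_point]
```

Only mount points that are actually accessed end up in `mounted`, which is how the mount table stays small.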
NFS Optimization - Server Caching
Similar to UNIX file caching for local files
pages (blocks) from disk are held in a main memory buffer cache until the space
is required for newer pages. Read-ahead and delayed-write optimizations.
For local files, writes are deferred to next sync event (30 second intervals)
Works well in local context, where files are always accessed through the local
cache, but in the remote case it doesn't offer necessary synchronization
guarantees to clients.
NFS v3 servers offer two strategies for updating the disk:
write-through - altered pages are written to disk as soon as they are received at
the server.
delayed commit - pages are held only in the cache until a commit() call is
received for the relevant file. This is the default mode used by NFS v3 clients. A
commit() is issued by the client whenever a file is closed.
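The difference between the two strategies can be sketched with a toy cache in which a dictionary stands in for the disk. The class and its fields are hypothetical and ignore cache eviction.

```python
class ToyServerCache:
    def __init__(self, write_through):
        self.write_through = write_through
        self.cache = {}      # (file, page number) -> bytes held in memory
        self.disk = {}       # (file, page number) -> bytes on stable storage
        self.dirty = set()   # pages written to the cache but not yet to disk

    def write(self, file, page, data):
        self.cache[(file, page)] = data
        if self.write_through:
            self.disk[(file, page)] = data       # on disk before the call returns
        else:
            self.dirty.add((file, page))         # held in the cache until commit()

    def commit(self, file):
        # Flush every dirty page of this file; pages of other files stay cached.
        for key in sorted(k for k in self.dirty if k[0] == file):
            self.disk[key] = self.cache[key]
            self.dirty.discard(key)
```

In delayed commit mode, a crash before `commit` would lose the dirty pages, which is why the client issues a commit when it closes the file.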
NFS Optimization - Client Caching
Server caching does nothing to reduce the traffic between
client and server
further optimization is essential to reduce server load in large networks
NFS client module caches the results of read, write, getattr, lookup and readdir
operations
synchronization of file contents (one-copy semantics) is not guaranteed when
two or more clients are sharing the same file.
Timestamp-based validity check
reduces inconsistency, but doesn't eliminate it
validity condition for cache entries at the client:
$(T - Tc < t) \lor (Tm_{client} = Tm_{server})$
t: freshness guarantee
Tc: time when cache entry was last validated
Tm: time when block was last updated
T: current time
t is configurable (per file) but is typically set to 3 seconds for files and 30 secs. for directories
it remains difficult to write distributed applications that share files with NFS
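The validity condition translates directly into code. In this sketch the server's modification time is obtained through a callback, so the server is contacted only when the freshness interval has expired. The function name, the callback and the returned pair are assumptions made for illustration.

```python
def cache_entry_valid(T, Tc, t, tm_client, get_tm_server):
    """Evaluate (T - Tc < t) or (Tm_client = Tm_server).

    Returns (valid, new_Tc). get_tm_server is called only when the
    freshness interval t has expired.
    """
    if T - Tc < t:
        return True, Tc                  # still fresh: no server contact needed
    if tm_client == get_tm_server():     # one attribute request to the server
        return True, T                   # revalidated now, so Tc becomes T
    return False, Tc                     # stale: the block must be fetched again
```

An entry validated 2 seconds ago with t = 3 is used without contacting the server, so an update made by another client in that window goes unnoticed. This is the inconsistency the check reduces but cannot eliminate.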
Summary
Distributed file systems provide the illusion of a local file system
and hide complexity from end users.
Sun NFS is an excellent example of a distributed service
designed to meet many important design requirements
Effective client caching can produce file service performance
equal to or better than local file systems
Consistency versus update semantics versus fault tolerance
remains an issue
Most client and server failures can be masked