BASICS OF DISTRIBUTED SYSTEMS
DISTRIBUTED FILE SYSTEMS
GOOGLE CHUBBY
(≈ APACHE ZOOKEEPER)
CHUBBY
Distributed lock service + simple fault-tolerant file system
Interfaces
File access
Event notification
File locking
Chubby is used to:
Manage coarse-grained, long-term locks (hours or days, not < sec)
get/release/check lock – identified with a name
Store small amounts of data associated with a name
E.g., system configuration info, identification of primary coordinators
Elect masters
Design priority: availability rather than performance
CHUBBY DEPLOYMENT
Client library + a Chubby cell (5 replica servers)
CHUBBY: DESIGN DECISIONS
Lock services for distributed systems
Read-write small files
File access is controlled with sessions
Files are locally cacheable
Locks are coarse-grained
Strict consistency
CHUBBY MASTER
Chubby has at most one master
All requests from the client go to the master
All other nodes (replicas) must agree on who the master is
Paxos consensus protocol used to elect a master
Master gets a lease time
Re-run master selection after lease time expires to extend the lease or if the master fails
When a Chubby node receives a proposal for a new master
It will accept it only if the old master’s lease expired
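A minimal sketch of this lease rule, assuming a replica simply tracks the current master's lease expiry as a timestamp (class and field names are illustrative, not Chubby internals):

import time

class Replica:
    """Illustrative Chubby-style replica: accept a new-master proposal
    only once the old master's lease has expired."""

    def __init__(self):
        self.master = None
        self.lease_expires = 0.0          # epoch seconds; 0 = no active lease

    def on_master_proposal(self, candidate, lease_seconds):
        if time.time() < self.lease_expires:
            return False                  # old master's lease still valid: reject
        self.master = candidate           # accept and start the new lease
        self.lease_expires = time.time() + lease_seconds
        return True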
SIMPLE USER-LEVEL API FOR CHUBBY
User-level RPC interface
Not implemented under VFS
Programs must access Chubby via an API
Look up Chubby nodes via DNS
Ask any Chubby node for the master node
File system interface (names, content, and locks)
CHUBBY: FILE SYSTEM INTERFACE
Each file has
Name
Data
Access control list
Lock
No modification or access times
No seek or partial reads/writes; no symbolic links; no moves
CHUBBY: API
Function – Description
open() – Set mode: read, write & lock, change ACL, event list, lock-delay, create
close() – Close the file handle
GetContentsAndStat() – Read file contents & metadata
SetContents(), SetACL() – Write file contents or ACL
Delete() – Delete the file
Acquire(), TryAcquire(), Release() – Lock operations
GetSequencer() – Sequence # for a lock
SetSequencer() – Associate a sequencer with a file handle
CheckSequencer() – Check if sequencer is valid
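A hypothetical client-side use of the calls above; the chubby library object and its exact signatures are assumptions for illustration, not the real interface:

def update_config(chubby):
    # Open a small file, take its writer lock, update it, and return a
    # sequencer that downstream servers can check.
    h = chubby.open("/ls/cell/config", mode="write", lock=True)
    if not h.TryAcquire():                   # writer lock not free
        return None
    try:
        data, stat = h.GetContentsAndStat()  # read current contents + metadata
        h.SetContents(data + b"\nupdated=1") # write new contents
        return h.GetSequencer()              # proof of lock ownership for servers
    finally:
        h.Release()
        h.close()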
CHUBBY: LOCKS
Every file & directory can act as a reader-writer lock
Either one client can hold an exclusive (writer) lock
Or multiple clients can hold reader locks
Locks are advisory
If a client releases a lock, the lock is immediately available
If a client fails, the lock will be unavailable for a lock-delay period (typically 1 minute)
USING LOCKS FOR LEADER ELECTION
Using Chubby locks makes leader election easy
No need for user servers to participate in a consensus protocol … the programmer doesn't need to figure out Paxos (or Raft)
Chubby provides the fault tolerance
Participant tries to acquire a lock
If it gets it, then it’s the master for whatever service it's providing!
Example: electing a master & using it to write to a file server
Participant gets a lock, becomes master (for its service, not Chubby)
Gets a lock sequence count
In each RPC to a server, send the sequence count to the server
During request processing, a server will reject old (delayed) packets
if (sequence_count < current_sequence_count)
reject request /* it must be from a delayed packet */
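A sketch of both sides of this pattern, reusing the hypothetical Chubby-style API sketched earlier; the server-side check mirrors the pseudocode above (all names are illustrative):

def try_become_master(chubby, lock_name="/ls/cell/my-service/leader"):
    h = chubby.open(lock_name, mode="write", lock=True)
    if h.TryAcquire():
        return h.GetSequencer()       # pass this sequence count in every RPC
    return None                       # someone else is the master

class FileServer:
    def __init__(self):
        self.current_sequence_count = 0

    def handle_write(self, sequence_count, payload):
        if sequence_count < self.current_sequence_count:
            return "REJECT"           # delayed request from an old master
        self.current_sequence_count = sequence_count
        # ... apply payload to the file ...
        return "OK"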
SESSIONS
Client requests for session
Master grants it
Client sends “KeepAlive” messages
If no “KeepAlive” message is received by the master – the session is closed
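A minimal sketch of session tracking with KeepAlives; the timeout value and names are illustrative:

import time

SESSION_TIMEOUT = 12.0                        # illustrative lease length, seconds

class SessionMaster:
    def __init__(self):
        self.sessions = {}                    # client_id -> last KeepAlive time

    def keep_alive(self, client_id):
        self.sessions[client_id] = time.time()

    def expire_sessions(self):
        now = time.time()
        for client_id, last in list(self.sessions.items()):
            if now - last > SESSION_TIMEOUT:
                del self.sessions[client_id]  # no KeepAlive seen: close the session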
EVENTS
Clients may subscribe to events:
File content modifications
Child node added/removed/modified
Chubby master failed over
File handle & its lock became invalid
Lock acquired
Conflicting lock request from another client
CHUBBY CLIENT CACHE
Master writes to the client cache
Client works with its cache
If changes were made, the master invalidates the cache
If so, the client communicates with the master
CHUBBY CLIENT CACHING & MASTER REPLICATION
At the client
Data cached in memory by Chubby clients
Cache is maintained by a Chubby lease, which can be invalidated
All clients write through to the Chubby master
At the master
Writes are propagated via Paxos consensus to all Chubby replicas
Data updated in total order – replicas remain synchronized
The master replies to a client after the writes reach a majority of replicas
Cache invalidations
Master keeps a list of what each client may be caching
Invalidations sent by master and are acknowledged by client
File is then cacheable again
Chubby database is backed up to GFS every few hours
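A sketch of this invalidate-then-write path; Paxos itself is hidden behind a stand-in replicate_to_majority call, and all names are illustrative:

class ChubbyMaster:
    """Illustrative write path: invalidate caching clients, replicate to a
    majority, then acknowledge the writer."""

    def __init__(self, replicas, ):
        self.replicas = replicas
        self.cached_by = {}                      # filename -> set of caching clients

    def write(self, filename, data):
        # 1. Invalidate every cache that may hold this file (clients ack).
        for client in self.cached_by.get(filename, set()):
            client.invalidate(filename)          # file is cacheable again later
        self.cached_by[filename] = set()
        # 2. Propagate the write to a majority of replicas, in total order.
        self.replicate_to_majority(filename, data)
        # 3. Only now acknowledge the writing client.
        return "OK"

    def replicate_to_majority(self, filename, data):
        acks = sum(1 for r in self.replicas if r.apply(filename, data))
        assert acks > len(self.replicas) // 2    # majority reached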
APACHE ZOOKEEPER
Similar to Chubby
Different naming
Writes via Leader
Read from any replica
Only 4 events/watches:
NEED FOR PARALLEL FILE SYSTEMS
Central servers
Point of congestion, single point of failure
Alleviate somewhat with replication and client caching
E.g., Coda, tokens (aka leases, oplocks)
Client-server file systems: limited replication can lead to congestion
File data is still centralized
A file server stores all data from a file – not split across servers
Even if replication is in place, a client downloads all data for a file from one server
File sizes are limited to the capacity available on a server
What if you need a 1,000 TB file?
GOOGLE FILE SYSTEM
GFS
(≈ APACHE HADOOP DISTRIBUTED FILE SYSTEM)
GFS
Scalable distributed file system
Designed for large data-intensive applications
Fault-tolerant; runs on commodity hardware
Delivers high performance to a large number of clients
DESIGN ASSUMPTIONS
Assumptions for conventional file systems don’t work
E.g., “most files are small”, “lots have short lifetimes”
Component failures are the norm, not an exception
File system = thousands of storage machines
Some % not working at any given time
Files are huge. Multi-TB files are the norm
It doesn’t make sense to work with billions of KB-sized files
I/O operations and block size choices are also affected
DESIGN ASSUMPTIONS
File access:
Most files are appended, not overwritten
Random writes within a file are almost never done
Once created, files are mostly read; often sequentially
Workload is mostly:
Reads: large streaming reads, small random reads – these dominate
Large appends
Hundreds of processes may append to a file concurrently
GFS will store a modest number of files for its scale
approx. a few million
Co-designing the GFS API together with the applications benefits the overall system
Apps can handle a relaxed consistency model
WHAT IS A PARALLEL FILE SYSTEM?
Conventional file systems
Store data & metadata on the same storage device
Example:
Linux directories are just files that contain lists of names & inodes
inodes are data structures placed in well-defined areas of the disk that contain information about the file
Lists of the block numbers that hold file data are themselves allocated from the same pool of blocks used for file data
Parallel file systems:
File data can span multiple servers
Metadata can be on separate servers
Metadata = information about the file
Includes name, access permissions, timestamps, file size, & locations of data blocks
Data = actual file contents
BASIC DESIGN IDEA
Use separate servers to store metadata
Metadata includes lists of (server, block_number) sets that identify which blocks on which servers hold file data
We need more bandwidth for data access than metadata access
Metadata is small; file data can be huge
Use large logical blocks
Most "normal" file systems are optimized for small files
A block size is typically 4KB
Expect huge files, so use huge blocks … >1,000x larger
The list of blocks that makes up a file becomes easier to manage
Replicate data
Expect some servers to be down
Store copies of data blocks on multiple servers
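A sketch of the kind of per-file metadata record this design implies; field names, block size, and replica placements are illustrative:

from dataclasses import dataclass, field

BLOCK_SIZE = 64 * 1024 * 1024          # large logical blocks (64 MB here)

@dataclass
class FileMetadata:
    name: str
    size: int
    permissions: int
    # Each logical block maps to several (server, block_number) replicas.
    blocks: list = field(default_factory=list)

meta = FileMetadata(
    name="/logs/crawl-2024.dat",
    size=3 * BLOCK_SIZE,
    permissions=0o640,
    blocks=[
        [("server-a", 17), ("server-b", 4), ("server-c", 9)],    # block 0, 3 replicas
        [("server-b", 5), ("server-d", 22), ("server-a", 18)],   # block 1
        [("server-c", 10), ("server-d", 23), ("server-e", 2)],   # block 2
    ],
)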
FILE SYSTEM INTERFACE
GFS does not have a standard OS-level API
No POSIX system call level API – no kernel/VFS implementation
User-level API for accessing files
GFS servers are implemented in user space using native Linux FS
Files organized hierarchically in directories
Operations
Basic operations
Create, delete, open, close, read, write
Additional operations
Snapshot: create a copy of a file or directory tree at low cost
Append: allow multiple clients to append atomically without locking
GFS MASTER & CHUNKSERVERS
GFS cluster
Multiple chunkservers
Data storage: fixed-size chunks
Chunks replicated on several systems
One master
Stores file system metadata (names, attributes)
Maps files to chunks
GFS MASTER & CHUNKSERVERS
GFS cluster
GFS FILES
CORE PART OF GOOGLE CLUSTER ENVIRONMENT
Google Cluster Environment
Core services: GFS + cluster scheduling system
Typically 100s to 1000s of active jobs
200+ clusters, many with 1000s of machines
Pools of 1000s of clients
4+ PB filesystems, 40 GB/s read/write loads
CHUNKS AND CHUNKSERVERS
Chunk size = 64 MB (default)
Chunkserver stores a 32-bit checksum with each chunk
In memory & logged to disk: allows it to detect data corruption
Chunk Handle: identifies a chunk
Globally unique 64-bit number
Assigned by the master when the chunk is created
Chunkservers store chunks on local disks as Linux files
Each chunk is replicated on multiple chunkservers
Three replicas (different levels can be specified)
Popular files may need more replicas to avoid hotspots
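A sketch of the per-chunk integrity check, using CRC-32 as a stand-in for the 32-bit checksum mentioned above:

import zlib

class Chunk:
    """Illustrative chunkserver-side check: keep a 32-bit checksum with each
    chunk and verify it on every read to detect corruption."""

    def __init__(self, handle: int, data: bytes):
        self.handle = handle                 # globally unique 64-bit id from the master
        self.data = data
        self.checksum = zlib.crc32(data)     # 32-bit checksum (CRC-32 as stand-in)

    def read(self) -> bytes:
        if zlib.crc32(self.data) != self.checksum:
            raise IOError(f"chunk {self.handle:#018x} is corrupt")
        return self.data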
MASTER
Maintains all file system metadata
Namespace
Access control info
Filename to chunks mappings
Current locations of chunks
Manages
Chunk leases (locks)
Garbage collection (freeing unused chunks)
Chunk migration (copying/moving chunks)
Fault tolerance
Operation log replicated on multiple machines
New master can be started if the master fails
Periodically communicates with all chunkservers
Via heartbeat messages to get state and send commands
CLIENT INTERACTION MODEL
GFS client code linked into each app
No OS-level API – you have to use a library
Interacts with master for metadata-related operations
Interacts directly with chunkservers for file data
All reads & writes go directly to chunkservers
Master is not a point of congestion
Neither clients nor chunkservers cache data
Except for caching done by the OS buffer cache
Clients cache metadata
E.g., location of a file’s chunks
ONE MASTER = SIMPLIFIED DESIGN
All metadata stored in master’s memory
Super-fast access
Namespaces and name-to-chunk_list maps
Stored in memory
Also persist in an operation log on the disk
Replicated onto remote machines for backup
Operation log
Similar to a journal
All operations are logged
Periodic checkpoints (stored in a B-tree) to avoid playing back entire log
Master does not store chunk locations persistently
This is queried from all the chunkservers: avoids consistency problems
OPERATION LOGS
All operations logged
If the master crashes, it replays the operation log after restarting
An additional shadow master can be used as a backup
WHY LARGE CHUNKS?
Default chunk size = 64MB
(compare to Linux ext4 block sizes: typically 4 KB and up to 1 MB)
Reduces need for frequent communication with the master to get chunk location info – one query can give the location of lots of bytes of data
Clients can easily cache info to refer to all data of large files
Cached data has timeouts to reduce possibility of reading stale data
A large chunk makes it feasible to keep a TCP connection open to a chunkserver for an extended time
Master stores <64 bytes of metadata for each 64MB chunk
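A quick back-of-the-envelope check of that ratio, assuming ~64 bytes of metadata per 64 MB chunk:

CHUNK_SIZE = 64 * 2**20          # 64 MB
META_PER_CHUNK = 64              # ~64 bytes of master metadata per chunk

data = 2**50                     # 1 PB of file data
chunks = data // CHUNK_SIZE      # 16,777,216 chunks
metadata = chunks * META_PER_CHUNK

print(chunks, metadata / 2**30)  # ~16.8M chunks, ~1 GiB of metadata in master RAM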
READ
Contact the master
Get file’s metadata: list chunk handles
Get the location of each of the chunk handles
Multiple replicated chunkservers per chunk
Contact any available chunkserver for chunk data
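A client-side sketch of this read path; the master.lookup and read_chunk RPC stubs are assumptions, not the real GFS client library:

import random

CHUNK_SIZE = 64 * 2**20

def gfs_read(master, filename, offset, length):
    """Illustrative GFS-style read: metadata from the master,
    data directly from any chunkserver holding a replica."""
    chunk_index = offset // CHUNK_SIZE
    # One RPC to the master: chunk handle + replica locations (then cacheable).
    handle, locations = master.lookup(filename, chunk_index)
    # Data flows straight from a chunkserver, never through the master.
    chunkserver = random.choice(locations)
    return chunkserver.read_chunk(handle, offset % CHUNK_SIZE, length)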
WRITE
Less frequent than reading
Master grants a chunk lease to one of the replicas
This replica will be the primary replica chunkserver
Primary can request lease extensions, if needed
Master increases the chunk version number and informs replicas
WRITE: PHASE 1 – SEND DATA
Deliver data but don’t write to the file
A client is given a list of replicas
Identifying the primary and secondaries
Client writes to the closest replica chunkserver that has not received the data
Replica forwards the data to another replica chunkserver
That chunkserver forwards to another replica chunkserver …
Chunkservers store this data in a cache – it’s not part of the file yet
Goal: maximize bandwidth via pipelining and minimize latency by forwarding data as soon as it is received (see the sketch below)
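A sketch of the pipelined push: each chunkserver stages the data and immediately forwards it down the replica chain (a real implementation overlaps receive and send; forwarding is shown sequentially here, and names are illustrative):

class Chunkserver:
    def __init__(self, name):
        self.name = name
        self.staged = {}                      # data_id -> bytes, not yet in the file

    def push(self, data_id, data, chain):
        self.staged[data_id] = data           # cache only; no write to the chunk yet
        if chain:                             # forward along the replica chain
            chain[0].push(data_id, data, chain[1:])

# Usage: client pushes once to the nearest replica; data fans down the chain.
a, b, c = Chunkserver("A"), Chunkserver("B"), Chunkserver("C")
a.push("txn-42", b"record bytes", [b, c])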
WRITE: PHASE 2 – WRITE DATA
Add it to the file (commit)
Client waits for replicas to acknowledge receiving the data
Sends a write request to the primary, identifying the data that was sent
The primary is responsible for serialization of writes
Assigns consecutive serial numbers to all writes that it received
Applies writes in serial-number order and forwards write requests in order to secondaries
Once all acknowledgements have been received, the primary acknowledges the client
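A sketch of the commit step: the primary picks the serial order, applies it locally, and forwards the same order to the secondaries before acknowledging the client (leases and error handling omitted; names are illustrative):

class PrimaryReplica:
    def __init__(self, secondaries):
        self.secondaries = secondaries
        self.next_serial = 0
        self.applied = []                       # (serial, data_id) in commit order

    def commit(self, data_id):
        serial = self.next_serial               # primary chooses the global order
        self.next_serial += 1
        self.applied.append((serial, data_id))  # apply locally in serial order
        for s in self.secondaries:              # secondaries apply in the same order
            s.apply(serial, data_id)
        return "OK"                             # reached only after every secondary applied

class SecondaryReplica:
    def __init__(self):
        self.applied = []

    def apply(self, serial, data_id):
        self.applied.append((serial, data_id))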
WRITE
Data Flow (phase 1) is different from Control Flow (phase 2)
Data Flow (upload):
Client to chunkserver, to chunkserver, to chunkserver…
Order does not matter
Control Flow (write):
Client to primary; primary to all secondaries
Locking used; order maintained
Chunk version numbers are used to detect if any replica has stale data (was not updated because it was down)
NAMESPACE
No per-directory data structure like most file systems
E.g., directory file contains names of all files in the directory
No aliases (hard or symbolic links)
Namespace is a single lookup table
Maps pathnames to metadata
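A sketch of the flat namespace: one lookup table keyed by full pathname, with "directory listing" reduced to a prefix scan (contents are illustrative):

# Flat namespace: full pathname -> metadata, no per-directory structures.
namespace = {
    "/logs/crawl-2024.dat": {"chunks": [0x1A, 0x1B], "owner": "crawler"},
    "/logs/crawl-2023.dat": {"chunks": [0x0C], "owner": "crawler"},
}

def lookup(path):
    return namespace[path]          # single table lookup, no directory walk

def list_dir(prefix):
    # "Listing a directory" is just a prefix scan over the table.
    return [p for p in namespace if p.startswith(prefix.rstrip("/") + "/")]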
HADOOP
HDFS
HDFS: HADOOP DISTRIBUTED FILE SYSTEM
Primary storage system for Hadoop applications
Hadoop – Software library – framework that allows for the distributed processing of large data sets across clusters of computers
Hadoop includes:
MapReduce™: software framework for distributed processing of large data sets on compute clusters.
Avro™: A data serialization system.
Cassandra™: A scalable multi-master database with no single points of failure.
Chukwa™: A data collection system for managing large distributed systems.
HBase™: A scalable, distributed database that supports structured data storage for large tables.
Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
Mahout™: A scalable machine learning and data mining library.
Pig™: A high-level data-flow language and execution framework for parallel computation.
ZooKeeper™: A high-performance coordination service for distributed applications.
HDFS DESIGN GOALS & ASSUMPTIONS
HDFS is an open source (Apache) implementation inspired by GFS design
Similar goals and same basic design as GFS
Run on commodity hardware
Highly fault tolerant
High throughput – Designed for large data sets
OK to relax some POSIX requirements
Large scale deployments
Instance of HDFS may comprise 1000s of servers
Each server stores part of the file system’s data
But
No support for concurrent appends
HDFS DESIGN GOALS & ASSUMPTIONS
Write-once, read-many-times file access model
A file’s contents will not change
Simplifies data coherency
Suitable for web crawlers and MapReduce analytics applications
HDFS ARCHITECTURE
Written in Java
Master/Slave architecture
Single NameNode
Master server responsible for the namespace & access control
Multiple DataNodes
Responsible for managing storage attached to its node
A file is split into one or more blocks
Typical block size = 128 MB (vs. 64 MB for GFS)
Blocks are stored in a set of DataNodes
GFS vs. HDFS architecture (figures)
NAMENODE (= GFS MASTER)
Executes metadata operations
open, close, rename
Maps file blocks to DataNodes
Maintains HDFS namespace
Transaction log (EditLog) records every change that occurs to file system metadata
Entire file system namespace + file-block mappings is stored in memory
… and stored in a file (FsImage) for persistence
NameNode receives a periodic Heartbeat and Blockreport from each DataNode
Heartbeat = “I am alive” message
Blockreport = list of all blocks managed by a DataNode
Keep track of which DataNodes own which blocks & their replication count
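A sketch of how a NameNode-like process could fold Heartbeats and Blockreports into its block map; the timeout and data structures are illustrative, not HDFS internals:

import time

class NameNode:
    HEARTBEAT_TIMEOUT = 30.0                  # illustrative liveness window (seconds)

    def __init__(self):
        self.last_heartbeat = {}              # datanode -> last Heartbeat time
        self.block_locations = {}             # block_id -> set of datanodes

    def on_heartbeat(self, datanode):
        self.last_heartbeat[datanode] = time.time()

    def on_blockreport(self, datanode, block_ids):
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set()).add(datanode)

    def replication_count(self, block_id):
        now = time.time()
        live = {d for d, t in self.last_heartbeat.items()
                if now - t <= self.HEARTBEAT_TIMEOUT}
        return len(self.block_locations.get(block_id, set()) & live)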
DATANODE (= GFS CHUNKSERVER)
Responsible for serving read/write requests
Blocks are replicated for fault tolerance
App can specify # replicas at creation time
Can be changed later
Blocks are stored in the local file system at the DataNode
RACK-AWARE READS & REPLICA SELECTION
Client sends request to NameNode
Receives list of blocks and replica DataNodes per block
Client tries to read from the closest replica
Prefer same rack
Else same data center
Location awareness is configured by the admin
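A sketch of that preference order (same rack, then same data center, then anything); the topology labels stand in for the admin-configured location map:

def pick_replica(client_location, replicas):
    """replicas: list of (datanode, rack, datacenter); prefer the closest."""
    client_rack, client_dc = client_location
    same_rack = [r for r in replicas if r[1] == client_rack]
    same_dc = [r for r in replicas if r[2] == client_dc]
    return (same_rack or same_dc or replicas)[0]

# Example: client in rack "r1" of data center "dc-east"
replicas = [("dn7", "r4", "dc-west"), ("dn2", "r1", "dc-east"), ("dn5", "r3", "dc-east")]
print(pick_replica(("r1", "dc-east"), replicas))   # -> ("dn2", "r1", "dc-east")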
WRITES
Client caches file data into a temp file
When temp file ≥ one HDFS block size
Client contacts NameNode
NameNode inserts file name into file system hierarchy & allocates a data block
Responds to client with the destination data block
Client writes to the block at the corresponding DataNode
When a file is closed, remaining data is transferred to a DataNode
NameNode is informed that the file is closed
NameNode commits file creation operation into a persistent store (log)
Data writes are chained: pipelined
Client writes to the first (closest) DataNode
That DataNode writes the data stream to the second DataNode
And so on…
INTERNET-BASED FILE SYNC & SHARING:
DROPBOX
FILE SYNCHRONIZATION
Client runs on desktop
Uploads any changes made within the Dropbox folder
Huge scale
100+ million users syncing 1 billion files per day
Design
Small client that doesn’t take a lot of resources
Expect possibility of low bandwidth to user
Scalable back-end architecture
99%+ of code written in Python
infrastructure (storage, monitoring) software migrated to Go in 2013
a few components running on storage boxes migrated to Rust for memory efficiency
WHAT’S DIFFERENT ABOUT DROPBOX?
Most web-based apps have high read to write ratios
E.g., Twitter, Facebook, Reddit, … 100:1, 1000:1, or higher
But with Dropbox…
Everyone’s computer has a complete copy of their Dropbox
Traffic happens only when changes occur
File upload : file download ratio roughly 1:1
Huge number of uploads compared to traditional services
Must abide by most ACID requirements … sort of
Atomic: don’t share partially-modified files
Consistent:
Operations have to be in order and reliable
Cannot delete a file in a shared folder while others still see it
Durable: Files cannot disappear
DROPBOX: ARCHITECTURE EVOLUTION: VERSION 1
One server: web server, app server, mySQL database, sync server
DROPBOX: ARCHITECTURE EVOLUTION: VERSION 2
Server ran out of disk space: moved data to Amazon S3 service (key-value store)
Servers became overloaded: moved mySQL DB to another machine
Clients periodically polled server for changes
DROPBOX: ARCHITECTURE EVOLUTION: VERSION 3
Move from polling to notifications: add notification server
Split web server into two:
Amazon-hosted server hosts file content and accepts uploads (stored as blocks)
Locally-hosted server manages metadata
DROPBOX: ARCHITECTURE EVOLUTION: VERSION 4
Add more metaservers and blockservers
Blockservers do not access DB directly; they send RPCs to metaservers
Add a memory cache (memcache) in front of the database to avoid having to scale the database
DROPBOX: ARCHITECTURE EVOLUTION: VERSION 5
10s of millions of clients – Clients have to connect before getting notifications
Add 2-level hierarchy to notification servers: ~1 million connections/server
QUESTIONS?
NOW, BY E-MAIL, …