Anatomy of File Write and Read

HDFS has a master-slave architecture, with the NameNode acting as the master and DataNodes acting as slaves. The NameNode stores all metadata, while DataNodes store the actual data blocks. In a write operation, the client's output stream splits the data into packets that are pipelined to DataNodes for storage. In a read operation, the NameNode supplies the block locations and the client retrieves the data directly from those DataNodes.


ANATOMY OF FILE WRITE AND READ
• HDFS has a master-slave architecture: the NameNode acts as the master and the DataNodes act as slaves.
• The NameNode holds all the metadata; the DataNodes hold the actual data blocks.
• An HDFS client interacts with HDFS to perform write and read operations (a minimal client setup is sketched below).
• In a write operation, data sent by the client is stored on DataNodes.
• In a read operation, data is retrieved from the DataNodes to the client machine.
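As a starting point, here is a minimal sketch of how a client obtains the FileSystem handle used by the write and read steps on the following slides. It assumes the Hadoop Java API; the NameNode URI hdfs://namenode:9000 is a hypothetical placeholder.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HdfsClient {
    public static void main(String[] args) throws Exception {
        // Loads core-site.xml / hdfs-site.xml from the classpath if present.
        Configuration conf = new Configuration();
        // "hdfs://namenode:9000" is a placeholder NameNode address for this sketch.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);
        System.out.println("Connected to " + fs.getUri());
        fs.close();
    }
}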
ANATOMY OF FILE WRITE

[Diagram: client write path through FSDataOutputStream and the DataNode pipeline]
Step 1: The client creates a file by calling the create() method.
Step 2: DFS makes an RPC (Remote Procedure Call) to the NameNode to create a new
file (the NameNode performs various checks, e.g. that the file doesn't
already exist and that the client has permission to create it).
Step 3: As the client writes data, the output stream (FSDataOutputStream) splits it into data packets.
Step 4: The packets are queued in a pipeline (the data queue) and stored on
DataNode 1, DataNode 2, and so on.
Step 5: Once a packet has been stored on all DataNodes in the pipeline, an
acknowledgement is sent back to the output stream, which removes the packet from the pipeline.
Step 6: When the client has finished writing data, it calls the close() method on the stream.
Step 7: The file's metadata is committed on the NameNode (recorded in the edit log and later merged into the FSImage).
A minimal client-side sketch of these steps follows.
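The write steps map directly onto the Java client API. The sketch below assumes the fs handle from the earlier snippet and a hypothetical path /user/demo/sample.txt; the packet splitting, pipelining, and acknowledgements of Steps 3-5 all happen inside the returned FSDataOutputStream.

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteDemo {
    // 'fs' is a FileSystem handle like the one obtained in the earlier sketch.
    static void writeSample(FileSystem fs) throws java.io.IOException {
        // Steps 1-2: create() triggers the RPC to the NameNode, which checks
        // that the file does not already exist and that the client has
        // permission to create it.
        FSDataOutputStream out = fs.create(new Path("/user/demo/sample.txt"));

        // Steps 3-5: as the client writes, the stream splits the bytes into
        // packets, queues them, and pipelines them through the DataNodes;
        // acknowledgements remove packets from the pipeline.
        out.writeBytes("hello, hdfs\n");

        // Step 6: close() flushes the remaining packets and signals the
        // NameNode that the file is complete (Step 7 commits the metadata).
        out.close();
    }
}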
ANATOMY OF FILE READ
Step 1: The client calls the open() method to open the file.
Step 2: DFS calls the NameNode using RPC to determine the locations of the data blocks.
The NameNode returns the addresses of the DataNodes that hold each block.
Step 3: The client then calls the read() method on the stream (an FSDataInputStream,
which holds the DataNode addresses for the blocks).
Step 4: Data is streamed from the DataNodes back to the client.
Step 5: When the end of a block is reached, the FSDataInputStream closes the
connection to that DataNode and connects to the DataNode holding the next block.
Step 6: When the client has finished reading, it calls the close() method on the
FSDataInputStream.
A matching client-side sketch of the read path follows.
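This sketch of the read path again assumes the fs handle and the hypothetical path from the write example; the block-by-block DataNode connections of Steps 4-5 are managed inside the returned FSDataInputStream.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadDemo {
    // 'fs' is a FileSystem handle like the one obtained in the earlier sketch.
    static void readSample(FileSystem fs) throws java.io.IOException {
        // Steps 1-2: open() triggers the RPC to the NameNode, which returns
        // the DataNode addresses for the file's blocks.
        FSDataInputStream in = fs.open(new Path("/user/demo/sample.txt"));

        // Steps 3-5: read() streams data from the DataNodes; the stream moves
        // from block to block, switching DataNode connections as needed.
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);
        }

        // Step 6: close() releases the DataNode connections.
        reader.close();
    }
}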
