Answers to Test Questions - Distributed Systems
Q1:
Evaluate the advantages and limitations of using HDFS over Sun Network File System (NFS)
in managing large-scale distributed data in cloud-based environments. Support your
answer with real-world scenarios.
Answer:
Advantages of HDFS over NFS:
1. Scalability: HDFS is designed to scale horizontally and can manage petabytes of data
across thousands of nodes. An NFS export is typically served by a single server, which
becomes a capacity and throughput bottleneck at that scale.
2. Fault Tolerance: HDFS replicates each data block (3 copies by default), so losing a
node does not lose data. NFS has limited built-in fault tolerance and relies more on
external backup mechanisms.
3. Data Locality: HDFS lets processing frameworks run computation on the nodes where the
data is stored (data locality), improving performance. NFS often requires data to be
transferred over the network to the compute nodes.
4. Cost-Effective: HDFS runs on commodity hardware, while NFS might need
high-performance servers for stability.
5. Batch Processing Support: HDFS integrates with big data tools such as MapReduce,
Hive, and Spark.
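The replication point above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name hdfs_footprint is hypothetical, the replication factor of 3 comes from the answer above, and the 128 MiB block size is an assumption (it is the usual default in recent Hadoop releases, but it is configurable).

```python
def hdfs_footprint(file_size_bytes, block_size_bytes=128 * 1024**2, replication=3):
    """Estimate block count, replica count, and raw storage for one file.

    Assumptions (not from the original answer): 128 MiB blocks, and the
    last block only occupies as much disk space as the data it holds.
    """
    # Ceiling division: a partially filled last block still counts as a block.
    blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    replicas = blocks * replication
    raw_bytes = file_size_bytes * replication
    return blocks, replicas, raw_bytes


if __name__ == "__main__":
    blocks, replicas, raw = hdfs_footprint(1024**3)  # a 1 GiB file
    print(f"{blocks} blocks, {replicas} block replicas, {raw / 1024**3:.0f} GiB raw")
```

The trade-off is visible in the last value: fault tolerance through replication is paid for with roughly three times the raw disk capacity.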
Limitations of HDFS:
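1. High Latency for Small Files: HDFS is optimized for throughput on large files. It
handles many small files poorly because storage is block-based and metadata for every
file and block is kept in the NameNode's memory.
2. Complex Setup and Management: Running a cluster requires careful configuration and has
a learning curve.
3. Write-Once-Read-Many: HDFS supports reading and appending, but not modifying existing
file content in place.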
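The small-file limitation can be illustrated by counting the metadata objects the NameNode has to track. This is a rough sketch, not a sizing tool: the function names are hypothetical, the 128 MiB block size is an assumed default, and the figure of about 150 bytes of NameNode heap per file or block object is a commonly quoted rule of thumb rather than an exact number.

```python
def namenode_objects(total_bytes, file_size_bytes, block_size_bytes=128 * 1024**2):
    """Count file and block objects needed to store total_bytes as equal-sized files."""
    n_files = (total_bytes + file_size_bytes - 1) // file_size_bytes
    # Every non-empty file needs at least one block, however small it is.
    blocks_per_file = max(1, (file_size_bytes + block_size_bytes - 1) // block_size_bytes)
    return n_files * (1 + blocks_per_file)  # one file object plus its block objects


def estimated_heap_bytes(objects, bytes_per_object=150):
    """Rule-of-thumb heap estimate; bytes_per_object is an assumption."""
    return objects * bytes_per_object


if __name__ == "__main__":
    one_tib = 1024**4
    for label, size in [("1 MiB files", 1024**2), ("128 MiB files", 128 * 1024**2)]:
        objs = namenode_objects(one_tib, size)
        print(f"{label}: {objs} objects, ~{estimated_heap_bytes(objs) / 1024**2:.1f} MiB heap")
```

Under these assumptions the same terabyte of data costs far more NameNode memory when stored as small files, which is why small files are usually merged into larger ones before being loaded into HDFS.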
Real-World Example:
- HDFS Use Case: Yahoo uses HDFS to store and analyze web logs for user behavior
analytics.
- NFS Use Case: Enterprises use NFS for shared file systems in small to medium-scale
applications (e.g., team collaboration tools).
Q2:
Evaluate the role of wearable and embedded devices in transforming distributed computing
architectures. Discuss the challenges and opportunities they introduce.
Answer:
Role in Distributed Computing:
1. Edge Computing Enablers: Wearables and embedded devices push data processing to
the network edge, reducing latency.
2. Real-Time Data Collection: These devices continuously collect data for real-time
analysis.
3. Enhanced User Interaction: They enable context-aware computing and smart
environments.
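As a minimal sketch of what processing at the edge can mean, the function below reduces a window of raw heart-rate samples to a small summary on the device, so that only the summary (and any alert) needs to be sent upstream. The function name, the summary fields, and the 120 bpm alert threshold are all hypothetical choices made for illustration.

```python
def summarize_window(samples_bpm, alert_above=120):
    """Summarize one window of heart-rate samples on the device.

    alert_above is an illustrative threshold, not a clinical value.
    """
    if not samples_bpm:
        raise ValueError("window must contain at least one sample")
    peak = max(samples_bpm)
    return {
        "count": len(samples_bpm),
        "mean_bpm": round(sum(samples_bpm) / len(samples_bpm), 1),
        "max_bpm": peak,
        "alert": peak > alert_above,  # decided locally, without a cloud round trip
    }


if __name__ == "__main__":
    window = [72, 75, 74, 131, 78]  # made-up readings for one window
    print(summarize_window(window))
```

Sending one summary per window instead of every sample reduces network traffic and radio use, and the local alert decision avoids the latency of a round trip to the cloud.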
Opportunities:
1. Personalized Services: Real-time data enables hyper-personalization.
2. Improved Efficiency: Real-time monitoring and automation lead to predictive
maintenance and resource optimization.
3. Data-Driven Insights: Massive data generation supports AI and analytics models.
Challenges:
1. Security and Privacy: Data from wearables includes sensitive personal information.
2. Resource Constraints: Limited processing power, memory, and battery life restrict
functionality.
3. Interoperability Issues: Diverse platforms and standards can hinder integration.
Example:
- Smart Healthcare: Wearables like Fitbit monitor heart rate and activity, sharing data with
cloud systems for diagnosis and alerts.