0% found this document useful (0 votes)
10 views84 pages

Cs3551 DC Qns & Ans

The document discusses various topics in distributed computing, including Software Defined Networking (SDN) principles and architecture, cloud deployment and service models for an institution, the Chandy-Misra-Haas algorithm for deadlock detection, FIFO message queues in online banking, and adaptable technologies for a distributed environment. It emphasizes the importance of security, compliance, and scalability in cloud solutions, as well as the pros and cons of different distributed architectures like microservices and RESTful APIs. Additionally, it outlines implementation strategies and risk mitigation for various scenarios in distributed systems.

Uploaded by

agdanishr
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views84 pages

Cs3551 DC Qns & Ans

The document discusses various topics in distributed computing, including Software Defined Networking (SDN) principles and architecture, cloud deployment and service models for an institution, the Chandy-Misra-Haas algorithm for deadlock detection, FIFO message queues in online banking, and adaptable technologies for a distributed environment. It emphasizes the importance of security, compliance, and scalability in cloud solutions, as well as the pros and cons of different distributed architectures like microservices and RESTful APIs. Additionally, it outlines implementation strategies and risk mitigation for various scenarios in distributed systems.

Uploaded by

agdanishr
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 84

CS3551 – Distributed Computing

Additional Qns
Reg 2021
1.Prove that in a distributed computation, for an event the surface of a past cone (i.e., all the
events on the surface) form a consistent cut. Does it mean that all events on the surface of
thepast cone are always concurrent? Give an example to make your case.
2. Explain in detail software defined networking with architecture

Introduction to SDN (2 marks):


Software Defined Networking (SDN) is a network architecture approach
that separates the network control plane from the data plane, enabling centralized network
management and programmable network behavior.
Key Principles (2 marks):
1. Separation of Control and Data Planes
2. Centralized Control
3. Programmable Network Behavior
4. Global Network View
SDN Architecture Components (6 marks):
1. Application Layer:
• SDN Applications (Network Services)
• Business logic and policies
• Communicates via Northbound APIs
2. Control Layer:
• SDN Controller (Network OS)
• Centralized control logic
• Maintains global network view
• Policy enforcement
3. Infrastructure Layer:
• SDN Switches/Devices
• Forwarding elements
• Controlled via Southbound APIs
Key Interfaces (2 marks):
Northbound APIs:
• Interface between applications and controller
• RESTful APIs, Intent-based interfaces
• Examples: REST, NETCONF
Southbound APIs:
• Interface between controller and network devices
• Protocol: OpenFlow (primary), OVSDB, NetConf
SDN Controller Functions (1 mark):
• Flow table management
• Topology discovery
• Path computation
• Policy enforcement
• Load balancing
Benefits:
• Centralized control
• Network programmability
• Rapid service deployment
• Cost reduction
3. The management of an autonomous institution decided to migrate its website and database
from on-premises infrastructure to the Cloud. As an expert of cloud computing, you are
conducted by management to identify the best suited cloud deployment model and service model
for the institution. Describe the important factors you can consider the cloud deployment model
and service model for the institution. Also justify, in what way the selected models are best
suited for the institution compared to other models.

Institution Requirements Analysis (3 marks): For an autonomous institution migrating website


and database to cloud:
• Security Requirements: Student data protection, academic records confidentiality
• Compliance Needs: Educational regulations, data privacy laws
• Budget Constraints: Cost-effective solution for educational institution
• Scalability: Handle varying loads (admission periods, exam results)
• Availability: High uptime for academic operations

Cloud Deployment Model Selection (6 marks):


Recommended: Private Cloud
Factors Considered:
1. Security: Sensitive student and academic data requires enhanced security
2. Compliance: Educational institutions must comply with data protection regulations
3. Control: Need administrative control over infrastructure and data
4. Customization: Academic systems require specific configurations
Justification over other models:
• vs Public Cloud: Better security and compliance control
• vs Hybrid Cloud: Simpler management, no complexity of multi-cloud
• vs Community Cloud: Institution has specific requirements
Service Model Selection (4 marks):
Recommended: Platform as a Service (PaaS)
Factors Considered:
1. Development Focus: Institution can focus on application development
2. Database Management: Managed database services
3. Scalability: Automatic scaling for web applications
4. Maintenance: Platform maintenance handled by provider
Justification over other models:
• vs IaaS: Reduces infrastructure management overhead
• vs SaaS: Provides customization flexibility for academic needs
Implementation Strategy (2 marks):
1. Phased Migration: Start with website, then database
2. Data Backup: Comprehensive backup before migration
3. Staff Training: Cloud platform training for IT staff
4. Monitoring: Continuous performance and security monitoring
Cost-Benefit Analysis (1 mark):
• Reduced infrastructure costs
• Improved reliability and availability
• Enhanced disaster recovery capabilities
• Better resource utilization
4. Compare and contrast Chandy-Misra-Haas algorithm for the AND model and OR model
Chandy–Misra–Haas (CMH) Algorithm: AND vs OR Models

Deadlock detection in distributed systems is a challenging problem because the resources and
processes are spread across different sites. The Chandy–Misra–Haas (CMH) algorithm is one of the
most popular distributed deadlock detection algorithms.

It is classified into two models:


1. AND Model
2. OR Model

Theoretical Concepts and Assumptions

- Request Semantics: In the AND model, each process must acquire all its requested resources before
proceeding. In the OR model, a process requests several resources but can proceed if any one of them
is granted.
- Wait-For Graph (WFG): In the AND model, a cycle in the WFG always indicates a deadlock. In the
OR model, a cycle does not necessarily indicate deadlock unless it forms a knot.
- Dependency Definitions: A process Pj is dependent on Pk if Pj is blocked and (transitively) waiting
for a resource held by Pk.
- Communication Pattern: The AND model uses edge-chasing probes. The OR model uses diffusion
computation (query/reply messages).
- Assumptions: Both algorithms assume reliable message transmission and continuous blocking
during detection.

CMH Algorithm – AND Model


The AND-model algorithm is an edge-chasing probe algorithm. A blocked process Pi initiates
detection by sending probe messages along outgoing wait-for edges to other sites. Each probe is a
triplet (i,j,k) meaning “initiator Pi, sent by Pj’s site to Pk’s site”. Deadlock is declared when a probe
returns to Pi.

Pseudocode (AND Model):


Algorithm CMH_AND_Model(Pi):
if Pi is waiting for resource and request fails:
for each Pj that Pi waits for:
send probe(Pi, Pi, Pj)

On receiving probe(Pi, Pk, Pj):


if Pj is not waiting for resource:
discard probe
else if Pj waits for processes {Pm1, Pm2, ...}:
for each Pm in waiting list:
send probe(Pi, Pj, Pm)
if Pi == Pj:
report "Deadlock Detected"

Example:
Processes: P1, P2, P3, P4
Wait-for Graph: P1 → P2 → P3 → P1
Steps:
1. P1 initiates probe(P1, P1, P2)
2. P2 forwards to probe(P1, P2, P3)
3. P3 forwards to probe(P1, P3, P1)
4. Probe returns to initiator P1 ⇒ Deadlock Detected

Advantages:
- Easy to implement
- Each probe message is of fixed size
- Less computation and overhead
- No need for global WFG
- No false deadlocks

CMH Algorithm – OR Model


The OR-model algorithm uses diffusion computation with query and reply messages. A blocked
process Pi initiates a deadlock check by sending query messages to all processes in its dependent set.
The initiator Pi declares deadlock only if it receives all expected replies.

Pseudocode (OR Model):


Algorithm CMH_OR_Model(Pi):
if Pi is blocked:
for each Pj in dependent set:
send query(Pi, Pi, Pj)
num[i] = count(dependents)

On receiving query(Pi, Pj, Pk) at Pk:


if first query for deadlock initiated by Pi:
propagate to dependents of Pk
num[k] = count(dependents)
else if Pk continuously blocked:
send reply(Pi, Pk, Pj)

On receiving reply(Pi, Pj, Pk):


num[i] = num[i] - 1
if num[i] == 0:
if Pi == initiator:
report "Deadlock Detected"
else:
send reply to parent

Example:
Processes: P1, P2, P3
Dependencies:
P1 waits for P2 or P3
P2 waits for P3
P3 waits for P1

Steps:
1. P1 initiates query to P2 and P3
2. P2 forwards to P3; P3 forwards to P1
3. When all replies return to P1 ⇒ Deadlock Detected

Difference Between AND Model and OR Model


Chandy–Misra–Haas (AND Model) Chandy–Misra–Haas (OR Model)
Edge-chasing (Probe-based) Diffusion (Query/Reply)
Used when process needs all resources Used when process needs any one resource
Single message type: Probe(i,j,k) Two message types: Query(i,j,k) and
Reply(i,j,k)
Deadlock detected when probe returns to initiator Deadlock detected when all replies return to
initiator
Less communication overhead More message exchanges
Simpler and easy to implement Slightly complex to implement
No false (phantom) deadlocks Detects true deadlocks only (no phantom)
Efficient for AND-type systems Efficient for OR-type systems
No special data structures required Requires counters and flags for tracking
queries
Cycle indicates deadlock Knot indicates deadlock

Advantages and Limitations

AND Model Advantages:


- Simple implementation
- Fewer messages and lower overhead
- Detects cycles efficiently
- No false deadlocks

OR Model Advantages:
- Suitable for OR-style waiting conditions
- Handles complex resource dependencies
- Detects true deadlocks accurately

Limitations:
- AND model not suitable for OR conditions
- OR model has more message overhead and complexity

5. Give a real time scenario where FIFO message queue is used. Write and describe the
snapshot algorithms for FIFO channels

Real-time FIFO Message Queue Scenario (3 marks):


Scenario: Online Banking Transaction Processing
• System: Distributed banking system with multiple branches
• Requirement: Transaction requests must be processed in arrival order
• FIFO Usage:
• Account balance updates
• Transaction logs
• Inter-branch fund transfers
• Audit trail maintenance
Why FIFO is Critical:
• Maintains transaction ordering
• Prevents race conditions in balance updates
• Ensures consistency across distributed nodes
SnapShot Algorithm

Pesudo Code:
Correctness & Complexity

Consistency Properties (2 marks):


• Causal Consistency: Snapshot reflects causally consistent global state
• FIFO Ordering: Message order preserved in channel states
• Completeness: All processes participate in snapshot
• Termination: Algorithm terminates when all markers processed

6. A company like to have an advanced collaboration services like video structure. If they chat
and web conferences for their employees, but their system support any of the IT resources
due insufficient infrastructure could leverage cloud computing technology in their system,
suggest, suitable cloud type with proper justification. List the character cloud computing

Company Requirements Analysis (2 marks):


• Services Needed: Video conferencing, chat, web conferences
• Current Issue: Insufficient infrastructure to support IT resources
• Constraint: Need advanced collaboration without major infrastructure investment
Recommended Cloud Type: Public Cloud (3 marks):
Justification:
1. Cost-Effectiveness: No upfront infrastructure investment
2. Scalability: Handle varying collaboration demands
3. Quick Deployment: Immediate access to collaboration tools
4. Maintenance-Free: Provider handles infrastructure maintenance
Suitable Cloud Services (3 marks):
• SaaS Model: Microsoft 365, Google Workspace, Zoom
• Collaboration Tools: Teams, Slack, WebEx
• Video Infrastructure: Cloud-based video conferencing platforms
Implementation Strategy (2 marks):
1. Service Selection: Choose integrated collaboration suite
2. User Training: Staff training on cloud collaboration tools
3. Security Setup: Configure security policies and access controls
4. Integration: Integrate with existing business systems
Characteristics of Cloud Computing (3 marks):
Essential Characteristics:
1. On-Demand Self-Service: Users provision resources automatically
2. Broad Network Access: Services accessible over network
3. Resource Pooling: Multi-tenant resource sharing
4. Rapid Elasticity: Quick scaling up/down of resources
5. Measured Service: Pay-per-use resource monitoring
Service Models:
• SaaS: Software applications delivered over internet
• PaaS: Platform for application development and deployment
• IaaS: Virtualized computing infrastructure
Deployment Models:
• Public: Services over public internet

• Private: Dedicated infrastructure for single organization

• Hybrid: Combination of public and private

• Community: Shared infrastructure for specific community


7. Consider an ABC IT company wanted to provide services like scientific converter, data
converter, currency converter and some of the business logic as a server side component to
the developer to tailor make the application. Explain the various adaptable technologies to
implement in an distributed environment. State its merit and demerits.

Business Requirement Analysis (2 marks): ABC IT company needs to provide:


• Scientific converter services
• Data converter services
• Currency converter services
• Business logic as server-side components
• Platform for developers to build custom applications
Distributed Technologies for Implementation (10 marks):
1. Web Services Architecture (2.5 marks):
SOAP-based Web Services:
• Technology: SOAP, WSDL, UDDI
• Implementation: XML-based messaging
• Merit: Standardized, platform-independent, robust security
• Demerit: Heavy overhead, complex implementation
RESTful Web Services:
• Technology: HTTP, JSON/XML, REST principles
• Implementation: Resource-based URLs, HTTP methods
• Merit: Lightweight, easy to implement, cacheable
• Demerit: Less standardized, limited transaction support
2. Microservices Architecture (2.5 marks):
• Technology: Docker containers, Kubernetes orchestration
• Implementation: Independent service deployment
• Merit: Scalability, technology diversity, fault isolation
• Demerit: Complexity, network latency, distributed system challenges
3. Enterprise Service Bus (ESB) (2.5 marks):
• Technology: Message brokers, service registry
• Implementation: Centralized integration platform
• Merit: Service orchestration, protocol transformation
• Demerit: Single point of failure, performance bottleneck
4. Cloud-Native Solutions (2.5 marks):
• Technology: AWS Lambda, Azure Functions, Google Cloud Functions
• Implementation: Serverless computing model
• Merit: Auto-scaling, pay-per-execution, maintenance-free
• Demerit: Vendor lock-in, cold start latency
Recommended Architecture (2 marks):
Hybrid Approach:
• API Gateway: Centralized entry point
• Microservices: Individual converter services
• Container Orchestration: Kubernetes deployment
• Service Mesh: Inter-service communication
Implementation Benefits (1 mark):
• Scalable architecture
• Technology flexibility
• Independent deployment
• Developer-friendly APIs
Risk Mitigation (1 mark):
• Load balancing
• Circuit breaker pattern
• Monitoring and logging
• Security implementation
Step 1: Identify the Requirements

The ABC IT company needs to provide services like scientific converter, data converter, currency
converter, and business logic as server-side components. These services need to be adaptable in a
distributed environment.

Step 2: Choose Suitable Technologies

Several technologies can be considered for implementing these services in a distributed environment:

1. Microservices Architecture

2. RESTful APIs

3. GraphQL

4. Message Brokers (e.g., Kafka, RabbitMQ)

5. Containerization (e.g., Docker, Kubernetes)

Step 3: Evaluate Microservices Architecture

Merits:

 Scalability: Each service can be scaled independently.

 Flexibility: Different services can use different technologies.

 Resilience: Failure in one service doesn't affect others.

Demerits:

 Complexity: Managing multiple services can be complex.

 Latency: Inter-service communication can introduce latency.

Step 4: Evaluate RESTful APIs

Merits:

 Simplicity: Easy to understand and implement.

 Interoperability: Widely supported across different platforms.

Demerits:

 Over-fetching/Under-fetching: May return more or less data than needed.

 Statelessness: Each request must contain all necessary information.


Step 5: Evaluate GraphQL

Merits:

 Flexibility: Clients can request exactly the data they need.

 Single Endpoint: Reduces the number of API calls.

Demerits:

 Complexity: Requires a different mindset and learning curve.

 Performance: Can be slower if not optimized properly.

Step 6: Evaluate Message Brokers

Merits:

 Decoupling: Services can communicate asynchronously.

 Scalability: Can handle high throughput and large volumes of messages.

Demerits:

 Complexity: Requires managing message queues and ensuring delivery.

 Latency: Asynchronous communication can introduce delays.

Step 7: Evaluate Containerization

Merits:

 Portability: Services can run consistently across different environments.

 Efficiency: Lightweight and resource-efficient.

Demerits:

 Overhead: Managing containers can be complex.

 Learning Curve: Requires understanding of container orchestration.


8. An organization has planned to implement a client module that emulates a conventional file
system interface for application programs, and server modules, that perform operations for
clients on directories and on files. State the best distributed architecture to deploy and deliver
the system. Also explain the suitable system model to avoid failures and adapt a recovery system
for the same in case of unavoidable failures.

System Requirements Analysis (2 marks):

• Client Module: Emulates conventional file system interface

• Server Modules: Handle file and directory operations

• Requirements: Transparency, scalability, fault tolerance

• Operations: File read/write, directory management

Recommended Distributed Architecture (6 marks):

Architecture: Client-Server with Distributed File System

Components:

1. Client Interface Layer:

• Virtual File System (VFS) interface

• Provides POSIX-compliant file operations

• Handles local caching and consistency


2. Metadata Server:

• Maintains directory structure

• File location information

• Access control and permissions

• Namespace management

3. Data Servers:

• Store actual file data

• Handle read/write operations

• Data replication and striping

• Load balancing

4. Communication Layer:

• RPC mechanisms

• Network protocols (NFS, CIFS)

• Message passing interface

Architecture Benefits:

• Scalability: Horizontal scaling of data servers

• Performance: Distributed load across servers

• Transparency: Standard file system interface

System Model for Failure Handling (4 marks):

Failure Types:

1. Node Failures: Server crashes, network partitions

2. Byzantine Failures: Malicious or corrupted servers

3. Network Failures: Communication link failures

Fault Tolerance Mechanisms:

1. Replication: Data stored on multiple servers

2. Consensus Protocols: Paxos/Raft for consistency


3. Heartbeat Monitoring: Failure detection

4. Checkpointing: Periodic state snapshots

Recovery System Design (4 marks):

Recovery Strategies:

1. Data Recovery:

• Primary-Backup Replication: Hot standby servers

• Quorum-based Replication: Majority voting

• Version Control: File versioning for rollback

2. Metadata Recovery:

• Distributed Metadata: Replicated across servers

• Transaction Logs: Write-ahead logging

• Checkpoint Recovery: Consistent snapshots

3. Client Recovery:

• Client-side Caching: Local data caching

• Cache Coherency: Invalidation protocols

• Session Recovery: Stateless operations

4. Network Recovery:

• Alternative Routing: Multiple communication paths

• Timeout and Retry: Automatic retry mechanisms

• Load Redistribution: Dynamic load balancing

Implementation Details:

• Consensus Algorithm: Raft for leader election

• Consistency Model: Sequential consistency

• Failure Detection: Timeout-based monitoring

• Recovery Time: Minimize system downtime


9. Discriminate the Lamport's bakery algorithm in detail by providing the same algorithm.
CS3551 – Distributed Computing
Reg 2017 Ans
1. Tabulate the Interaction of the software components at each processor in the
distributed system
2.Analyse the Correctness criteria of the deadlock detection algorithm.

1. Safety Property (No False Positives):


• Algorithm must not report deadlock when none exists
• Verification through resource allocation graph analysis
• Ensures system doesn't unnecessarily terminate processes
2. Liveness Property (No False Negatives):
• Algorithm must detect actual deadlocks
• Complete cycle detection in wait-for graphs
• Ensures deadlocked processes are eventually identified
3. Progress Property:
• Algorithm must terminate in finite time
• Bounded message complexity
• No infinite loops in detection logic
4. Consistency Property:
• Consistent view of system state across all processes
• Synchronized snapshot mechanisms
• Causal ordering of detection messages
Verification Methods:
• Formal Proof: Mathematical correctness verification
• Simulation Testing: Scenario-based validation
• Invariant Checking: System state consistency
• Performance Analysis: Time and message complexity
3.Indicate the several interesting features of the Manivannan Singhal quasi-
synchronous check pointing algorithm.
4. Compare the results and lower bounds on agreement on solving the consensus
problem under different assumption.
Consensus Problem Overview:

Distributed processes must agree on a common value despite failures

and asynchronous communication.

Results and Lower Bounds Comparison:

1. Synchronous Systems:

• Result: Consensus possible with f < n failures

• Lower Bound: f + 1 rounds required


• Assumption: Known upper bound on message delays

2. Asynchronous Systems:

• Result: FLP Impossibility - no deterministic solution

• Lower Bound: No algorithm can guarantee termination

• Assumption: At least one process can fail

3.Partially Synchronous Systems:

• Result: Consensus achievable with randomization or failure detectors

• Lower Bound: GST (Global Stabilization Time) dependent

• Assumption: Eventually synchronous behavior

4. Byzantine Fault Model:

• Result: Requires n ≥ 3f + 1 processes

• Lower Bound: 2f + 1 rounds in synchronous case

• Assumption: Up to f malicious failures

5. Infer the use of causal order in updating replicas of a data item in the
communication system.

Causal Order Definition:

If event A causally precedes event B, then A must be processed before B at all

Replication as to maintain consistency.

Importance in Replication:

1. Consistency Maintenance:

• Preserves causality relationships


• Prevents anomalous states

• Ensures logical correctness

2. Update Ordering:

• Updates applied in causal order

• Vector timestamps for ordering

• Dependency tracking mechanisms

Implementation Mechanisms:

1. Vector Clocks:

• Maintains causal relationships

• Determines update ordering

• Detects concurrent updates

2. Causal Broadcast:

• Delivers messages in causal order

• Buffers out-of-order messages

• Ensures consistency across replicas

3. Update Propagation:

• Immediate propagation with ordering

• Buffering for consistency

• Conflict resolution protocols

Benefits:

• Maintains data integrity

• Preserves application semantics

• Enables optimistic replication

• Supports concurrent operations

Example Application:

• Collaborative editing systems


• Distributed databases

• File system replication

• Social media updates

6. What is consistency? Differentiate between sequential and causal consistency


model. Discuss the strategies employed for replacement while the shared memory
gets filled with replicated or migratory data.

What is Consistency? (3 marks)

Consistency in distributed systems refers to the requirement that all nodes in the system
have the same view of data at any given time. It ensures that when multiple processes
access shared data concurrently, the system behavior appears correct and predictable to
all users.

Key aspects:

• Data Consistency: All replicas contain identical data

• Temporal Consistency: Updates appear in logical order

• Causal Consistency: Related operations maintain proper ordering

Sequential vs Causal Consistency Models (8 marks)

Sequential Consistency Model (4 marks)

Definition:

All processes observe the same sequential order of operations, and operations from
each process appear in program order.

Characteristics:

• Global Order: All operations appear to be executed in some sequential order

• Program Order: Operations from same process maintain their original order

• Atomicity: Each operation appears to be executed atomically

• Linearizability: Stronger form ensures real-time ordering

Example:

Process P1: Write(X,1) →→ Write(Y,2)

Process P2: Read(Y,2) →→ Read(X,1)


Valid if: All processes see same global order

Implementation Requirements:

• Total ordering of all operations

• Synchronization mechanisms

• Global timestamp coordination

Causal Consistency Model (4 marks)

Definition:

Operations that are causally related are seen in the same order by all processes, but
concurrent operations can be seen in different orders.

Characteristics:

• Causal Relationships: If A→B causally, then A appears before B everywhere

• Concurrent Freedom: Unrelated operations can appear in any order

• Local Program Order: Operations within same process maintain order

• Weaker than Sequential: More flexible, better performance

Causal Relations:

1. Same Process: Operations in same process are causally related

2. Message Passing: Send causally precedes corresponding receive

3. Transitivity: If A→B and B→C, then A→C

Example:

Process P1: Write(X,1) causally precedes Write(Y,2)

Process P2: Can see different order for unrelated operations

Concurrent operations from P3 can be ordered differently

Implementation:

• Vector clocks for causal ordering

• Dependency tracking mechanisms

• Local buffering and reordering


Memory Replacement Strategies for Shared Memory (5 marks)

When shared memory becomes filled with replicated or migratory data, replacement
strategies are essential:

1. Least Recently Used (LRU) Strategy (1.5 marks)

• Principle: Replace the least recently accessed page

• Implementation: Timestamp-based tracking

• Advantage: Good temporal locality exploitation

• Disadvantage: High overhead for timestamp maintenance

2. Modified LRU for Distributed Systems (1.5 marks)

• Global LRU: Considers access patterns across all nodes

• Local LRU with Coordination: Each node maintains local LRU with periodic coordination

• Weighted LRU: Considers access frequency and data importance

3. Coherence-Aware Replacement (1 mark)

• Write-Back Priority: Keep modified pages longer

• Sharing Pattern Analysis: Consider how many nodes access the data

• Migration Cost: Factor in network overhead for page migration

4. Adaptive Strategies (1 mark)

• Dynamic Algorithm Selection: Choose strategy based on access patterns

• Load-Sensitive Replacement: Consider system load and network conditions

• Application-Aware Policies: Customize based on application requirements

Performance Metrics:

• Cache hit ratio improvement

• Network traffic reduction

• Memory utilization efficiency

• Response time optimization


8. Consider an e-shopping system capable of handling more user requesta at a
time from various geographic locations. The system will fail as it is designed to
handle only minimal count of requests at delayed time intervals. Propose at least 3
design constraints for enhancing the current system and discuss 3 associated
challenges.

Current System Analysis (2 marks)

Limitations:

• Minimal concurrent request handling capacity

• Delayed processing intervals

• Geographic distribution challenges

• Scalability bottleneck

Failure Scenarios:

• System overload during peak hours

• Poor response times from distant locations

• Resource exhaustion under high load

Design Constraints for Enhancement (9 marks)


Constraint 1: Horizontal Scalability Architecture (3 marks)

Design Approach:

• Microservices Architecture: Decompose monolithic system into independent services

• Load Balancer Implementation: Distribute requests across multiple server instances

• Auto-scaling Mechanisms: Dynamic resource allocation based on demand

Technical Implementation:

• Container orchestration (Kubernetes)

• Service mesh for inter-service communication

• Database sharding and replication

Benefits:

• Linear scalability with demand

• Independent service deployment

• Fault isolation between services

Constraint 2: Geographic Distribution with CDN (3 marks)

Design Approach:

• Content Delivery Network: Cache static content at edge locations

• Regional Data Centers: Deploy services closer to user populations

• Geo-load Balancing: Route users to nearest available datacenter

Technical Implementation:

• Multi-region cloud deployment

• Database replication across regions

• Content caching strategies

Benefits:

• Reduced latency for global users

• Improved availability during regional failures

• Better user experience across locations


Constraint 3: Asynchronous Processing Framework (3 marks)

Design Approach:

• Message Queue Implementation: Decouple request processing from response

• Event-driven Architecture: Use events for inter-service communication

• Background Job Processing: Handle time-intensive operations asynchronously

Technical Implementation:

• Message brokers (RabbitMQ, Apache Kafka)

• Event sourcing patterns

• CQRS (Command Query Responsibility Segregation)

Benefits:

• Better response times for users

• Improved system throughput

• Resilience to temporary failures

Associated Challenges (5 marks)

Challenge 1: Data Consistency Across Distributed Components (1.7 marks)

Problem:

• Maintaining ACID properties across distributed databases

• Eventual consistency vs strong consistency trade-offs

• Synchronization overhead between services

Solutions:

• Implement distributed transaction protocols

• Use saga patterns for long-running transactions

• Choose appropriate consistency models per use case

Challenge 2: Network Latency and Partition Tolerance (1.7 marks)

Problem:

• Network delays between geographic regions


• Handling network partitions gracefully

• CAP theorem trade-offs (Consistency, Availability, Partition tolerance)

Solutions:

• Implement circuit breaker patterns

• Use eventual consistency where appropriate

• Design for graceful degradation

Challenge 3: Complex System Monitoring and Debugging (1.6 marks)

Problem:

• Distributed tracing across multiple services

• Log aggregation and analysis

• Performance bottleneck identification

Solutions:

• Implement distributed tracing systems

• Centralized logging with correlation IDs

• Real-time monitoring and alerting systems

9. External synchronization ensures internal synchronization. But the vice versa


does not stand true. Justify. Explain Lamport's algorithm in brief.

External vs Internal Synchronization Analysis (8 marks)

Definitions (2 marks)

External Synchronization:

• Clock synchronization with external reference (UTC, GPS, NTP servers)

• Absolute time accuracy across all nodes

• Global time standard compliance

Internal Synchronization:

• Clock synchronization among distributed system nodes only

• Relative time consistency within the system


• No external time reference required

Why External Synchronization Ensures Internal Synchronization (3 marks)

Logical Proof:

1. External Reference: All nodes synchronize with same external source (UTC)

2. Transitive Property: If A syncs with UTC and B syncs with UTC, then A and B are
synchronized

3. Bound Accuracy: External sync provides bounded error guarantees

4. Global Consistency: All nodes maintain same global time view

Mathematical Representation:

Why Internal Synchronization Doesn't Ensure External Synchronization (3 marks)

Counter-example Scenario:

1. Drift Isolation: All internal clocks drift together from UTC

2. No External Reference: System maintains internal consistency but drifts from real time

3. Systematic Error: All nodes could be running fast/slow relative to UTC

Practical Example:

Key Insight: Internal synchronization only ensures relative accuracy, not absolute
accuracy.
10.What are the significant factors affecting the interacting processes in a
distributed systems? How the interaction model deals with the difficulty of setting
time limits in a distributed system?
1. Communication Latency and Bandwidth (2 marks)

Impact on Interaction:

• Variable Delays: Network conditions affect message delivery times

• Bandwidth Limitations: Affects throughput of bulk data transfer

• Geographic Distance: Physical distance increases latency

Mitigation Strategies:

• Asynchronous communication patterns

• Data compression techniques

• Strategic replica placement

2. Network Reliability and Failures (2 marks)

Types of Failures:

• Message Loss: Network packets dropped or corrupted

• Network Partitions: Temporary isolation of system components

• Link Failures: Permanent or temporary connection loss

Impact on Design:
• Need for timeout mechanisms

• Retry and acknowledgment protocols

• Graceful degradation strategies

3. Process Heterogeneity (2 marks)

Challenges:

• Different Hardware: Varying processing speeds and architectures

• Operating System Differences: Different system calls and behaviors

• Data Representation: Byte ordering, data format differences

Solutions:

• Standardized communication protocols

• Data serialization formats (JSON, Protocol Buffers)

• Middleware abstraction layers

4. Concurrency and Synchronization (2 marks)

Synchronization Challenges:
• Race Conditions: Multiple processes accessing shared resources

• Deadlock Prevention: Avoiding circular wait conditions

• Consistency Maintenance: Ensuring coherent global state

Coordination Mechanisms:

• Distributed locks and semaphores

• Consensus algorithms

• Event ordering protocols


Interaction Models and Time Limits (8 marks)

Time Limit Difficulties in Distributed Systems (4 marks)

Core Problems

1. Clock Synchronization Issues (1 mark):

• Different nodes have different local times

• Clock drift and skew problems

• Lack of global time reference

2. Variable Network Delays (1 mark)

• Unpredictable message transmission times

• Network congestion effects

• Route changes affecting latency

3. Processing Time Variations (1 mark)

• Different hardware capabilities

• Varying system loads

• Garbage collection and OS scheduling effects

4. Failure Detection Ambiguity (1 mark)


• Cannot distinguish slow process from failed process

• Timeout values difficult to set appropriately

• False failure detections vs missed failures

How Interaction Models Address Time Limit Difficulties (4 marks)

1. Synchronous Interaction Model (1 mark)


• Assumption: Known upper bounds on message delays and processing times

• Approach: Use worst-case timeout values

• Limitation: Often too conservative, leading to poor performance

2. Asynchronous Interaction Model (1 mark)

• Assumption: No bounds on message delays

• Approach: Event-driven programming, no timeouts

• Benefit: No false timeout failures

• Limitation: Cannot detect actual failures

3. Partially Synchronous Model (1 mark)

• Assumption: Bounds exist but are unknown or hold only eventually

• Approach: Adaptive timeout mechanisms

• Implementation: Failure detectors with varying accuracy

4. Real-time Interaction Model (1 mark)

• Assumption: Hard real-time constraints

• Approach: Guaranteed response time bounds

• Implementation: Resource reservation and admission control

• Application: Safety-critical systems

Practical Solutions:

• Heartbeat mechanisms for failure detection


• Adaptive timeout adjustment algorithms
• Quality-of-service specifications
• Circuit breaker patterns for fault tolerance
11.What are the key assumptions underlying while designing agreement
algorithms and brief them?
12.Discuss in detail the requirements that mutual exclusion algorithms should
satisfy and discuss what metric we use to measure the performance of mutual
exclusion algorithms.
13.Compare Synchronous versus asynchronous execution.
14.What are the functions must be addressed while designing and building a
distributed system? Explain.

Core Design Functions (10 marks)

1. Communication and Messaging (2 marks)

Communication Paradigms:

• Message Passing: Point-to-point and multicast communication

• Remote Procedure Calls: Transparent remote method invocation


• Event-based Communication: Publish-subscribe patterns

• Stream Processing: Continuous data flow handling

Protocol Stack:

• Application layer protocols (HTTP, gRPC)

• Transport layer reliability (TCP, UDP)

• Network routing and addressing

• Data serialization and marshaling

2. Naming and Directory Services (2 marks)

Naming Systems:

• Unique Identification: Global unique identifiers for all resources

• Name Resolution: Mapping names to network addresses

• Hierarchical Namespaces: Structured naming schemes

• Location Transparency: Hide physical resource locations

Directory Services:

• Service discovery mechanisms

• Resource registration and lookup

• Metadata management

• Access control and permissions

3. Synchronization and Coordination (2 marks)

Synchronization Primitives:

• Distributed Locks: Mutual exclusion across network

• Semaphores: Resource counting and limiting

• Barriers: Process synchronization points

• Condition Variables: Event-based coordination

Coordination Protocols:
• Leader election algorithms

• Consensus and agreement protocols

• Atomic broadcast mechanisms

• Distributed transaction management

4. Fault Tolerance and Reliability (2 marks)

Failure Handling:

• Failure Detection: Heartbeat and timeout mechanisms

• Failure Recovery: Process restart and state restoration

• Fault Masking: Redundancy and replication strategies

• Graceful Degradation: Partial functionality under failures

Reliability Techniques:

• Checkpointing and rollback recovery

• Message logging and replay

• Byzantine fault tolerance

• Error correction and detection codes

5. Security and Access Control (2 marks)

Security Functions:

• Authentication: Identity verification mechanisms

• Authorization: Access permission enforcement

• Encryption: Data confidentiality protection

• Integrity: Data tampering detection and prevention

Security Protocols:
• Public key infrastructure (PKI)

• Digital signatures and certificates

• Secure communication channels (TLS/SSL)

• Intrusion detection and prevention


Implementation Considerations (6 marks)

1. Performance Optimization (2 marks)

Optimization Strategies:

• Caching: Local and distributed caching mechanisms

• Load Balancing: Request distribution across servers

• Resource Management: CPU, memory, and network optimization

• Scalability Planning: Horizontal and vertical scaling approaches

Performance Monitoring:

• Real-time performance metrics collection

• Bottleneck identification and resolution

• Capacity planning and resource allocation

• Service level agreement (SLA) monitoring

2. Data Management (2 marks)

Data Distribution:

• Replication: Data copies for availability and performance

• Partitioning: Data distribution across multiple nodes

• Consistency Models: Strong vs eventual consistency trade-offs

• Transaction Management: ACID properties in distributed environment

Storage Systems:

• Distributed file systems

• NoSQL and distributed databases

• Data warehousing and analytics

• Backup and disaster recovery

3. System Integration (2 marks)


Integration Challenges:

• Interoperability: Different systems working together

• Legacy System Integration: Connecting old and new systems

• API Management: Standardized interfaces and protocols

• Middleware Solutions: Enterprise service buses and message brokers

Development and Deployment:

• Continuous integration/continuous deployment (CI/CD)

• Containerization and orchestration

• Configuration management

• Version control and rollback mechanisms

15.How do you classify a parallel system and brief them ?


CS3551 DC
Additional Qns
1. What is distributed program? Explain the model of Distrbuted computation
2.Distributed Communications
3. Explain Christians & Berkeley algorithm for synchronisation clocks
4.Logical Time
5.Causal Order & Total Order
Kshem Kalyani Algorithm PesudoCode:
7.Singhal Kshem Kalyani Algorithm:

You might also like