CC Unit-1: Cloud Computing

Introduction to Clouds

Cloud computing is a model that provides on-demand access to shared resources like servers,
storage, databases, applications, and more, all delivered over the internet. Instead of owning
physical hardware or data centers, companies can rent these resources from a cloud provider,
reducing costs and improving scalability.
Types of Cloud Models:
1. Public Cloud: Resources are owned and operated by third-party providers, like AWS,
Microsoft Azure, and Google Cloud, and delivered over the public internet.
2. Private Cloud: Used exclusively by a single organization, either managed internally
or by a third-party provider but hosted privately.
3. Hybrid Cloud: A combination of public and private clouds, allowing data and
applications to be shared between them for greater flexibility and optimization.
Introduction to Cloud Computing Concepts
Cloud computing refers to the delivery of computing services—such as storage, processing
power, databases, networking, software, and analytics—over the internet ("the cloud").
Instead of relying on local servers or personal computers, organizations can access these
resources on-demand from cloud providers, paying only for what they use.
Key Characteristics:
1. On-demand Service: Users can provision computing resources automatically,
without needing human intervention from the service provider.
2. Broad Network Access: Resources are accessible from anywhere via standard
network protocols and devices, including laptops, tablets, and mobile phones.
3. Resource Pooling: Multiple customers share resources in a multi-tenant environment,
with resources dynamically allocated based on demand.
4. Rapid Elasticity: Cloud computing can scale resources up or down as needed,
appearing unlimited to the user.
5. Measured Service: Cloud systems automatically monitor and control resource usage,
allowing for a pay-per-use model.
Cloud Service Models:
 Infrastructure as a Service (IaaS): Provides virtualized computing resources over
the internet. Examples: Amazon EC2, Google Compute Engine.
 Platform as a Service (PaaS): Offers a platform allowing customers to build, run,
and manage applications without worrying about the underlying infrastructure.
Examples: Microsoft Azure, Google App Engine.
 Software as a Service (SaaS): Delivers software applications over the internet, with
providers managing everything from servers to storage. Examples: Gmail, Salesforce,
Office 365.
Orientation Towards Cloud Computing Concepts
Cloud computing shifts the traditional view of IT infrastructure by moving resources to a
more dynamic, scalable, and accessible environment. Organizations no longer need to invest
in costly hardware or deal with physical maintenance, as cloud providers handle everything
from hardware to security.
Benefits of Cloud Computing:
1. Cost Efficiency: Reduces the need for capital investments in hardware, while offering
a flexible pay-per-use pricing model.
2. Scalability: Organizations can easily scale their infrastructure based on real-time
needs, allowing them to handle growing data and traffic efficiently.
3. Accessibility: Cloud services are accessible from any internet-connected device,
promoting flexibility for remote work and global collaboration.
4. Disaster Recovery: Cloud providers often include backup and disaster recovery
solutions, ensuring data is protected and can be restored quickly in case of failures.
5. Innovation: Cloud platforms provide cutting-edge tools (AI, machine learning, big
data analytics) that enable organizations to innovate faster and develop more
sophisticated applications.
Some Basic Computer Science Fundamentals
Understanding cloud computing concepts requires a foundational knowledge of several key
computer science concepts:
I. Basic Data Structures
1. Queue
o Definition: A first-in, first-out (FIFO) data structure: elements are enqueued at the
back and dequeued from the front.
o Example: If 3, 5, and 8 are enqueued in that order, the dequeuing sequence is:
 First: 3
 Then: 5
 Next: 8
 Continues in this manner.
2. Stack
o Definition: A last-in, first-out (LIFO) data structure (equivalently, first-in, last-out).
o Operations, starting from a stack that holds 5 at the bottom and 3 on top (a short
Python sketch of both structures follows this list):
 Push: Insert 9 (goes to the top).
 Pop: Removes 9.
 Next pop removes 3.
 Following pop removes 5, and so on.
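A minimal Python sketch of both structures, using the values from the examples above
(3, 5, 8 for the queue; 5 and 3 already on the stack, then 9 pushed):

```python
from collections import deque

# Queue: first-in, first-out (FIFO)
queue = deque()
for item in (3, 5, 8):          # enqueue 3, then 5, then 8
    queue.append(item)
print(queue.popleft())          # 3 (dequeued first)
print(queue.popleft())          # 5
print(queue.popleft())          # 8

# Stack: last-in, first-out (LIFO)
stack = [5, 3]                  # 5 at the bottom, 3 on top
stack.append(9)                 # push 9 (goes to the top)
print(stack.pop())              # 9
print(stack.pop())              # 3
print(stack.pop())              # 5
```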
II. Processes
 Definition: A program in action.
 Overview of process components and behavior.

III. Computer Architecture (Simplified)


Overview
A program written in languages like C++ or Java is compiled into low-level machine
instructions, which are stored in the file system.
The CPU loads instructions into memory (including cache and registers) in batches.
o As each instruction executes, the CPU retrieves the necessary data from
memory and performs any required stores.
o Memory can also be flushed to disk.
o This is a simplified view but effective for understanding basic concepts.
IV. Big O Notation
1. Definition
o A fundamental method for analyzing algorithms, describing the upper bound
on an algorithm's behavior as a variable approaches infinity.
o Focuses on run-time or performance metrics, specifically worst-case scenarios.
2. Informal Definition
o An algorithm A is O(foo) if it completes in c⋅foo time for some constant c,
beyond a certain input size N.
o Common examples include:
 O(N)
 O(N²)
3. Examples
o Example 1: Searching for an element in an unsorted list is O(N), where N is
the list size.
 Worst-case performance occurs when the element is absent or is the last one
in the list, requiring about N comparisons (i.e., the number of operations is
< c·N for a small constant, e.g., c = 2). A short linear-search sketch is
given below.
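A minimal sketch of Example 1 (linear search through an unsorted list), illustrating why
the worst case is O(N); the function name and test values are illustrative only:

```python
def linear_search(items, target):
    """Return the index of target in an unsorted list, or -1 if absent.

    Worst case (target absent or in the last position) examines all N
    elements, so the running time is O(N).
    """
    for i, value in enumerate(items):
        if value == target:
            return i
    return -1

print(linear_search([7, 2, 9, 4], 9))   # 2  (found at index 2)
print(linear_search([7, 2, 9, 4], 5))   # -1 (worst case: all 4 elements checked)
```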
V. Basic Probability
1. Concepts
o Set: A collection of items (e.g., S = "the set of all humans in the world").
o Subset: A collection that is part of a larger set (e.g., S2 = "the set of all
humans in Europe," where S2 is a subset of S).
2. Probability of Events
o Any event has a probability of occurring.
o Example: If you wake up at a random hour, the probability of it being between
10 AM and 11 AM is 1/24.
3. Multiplying Probabilities
o For independent events E1 and E2:
 Prob(E1 AND E2) = Prob(E1) × Prob(E2)
o Example: The probability of waking up between 10 AM and 11 AM AND wearing a
green shirt (assumed probability 1/3) is 1/24 × 1/3 = 1/72.
o Note: This multiplication rule does not apply if the events are dependent.
4. Adding Probabilities
o For events E1 and E2:
 Prob(E1 OR E2) = Prob(E1) + Prob(E2) − Prob(E1 AND E2)
o If Prob(E1 AND E2) is unknown, then:
 Prob(E1 OR E2) ≤ Prob(E1) + Prob(E2)
A short numeric sketch of these rules is given below.
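A tiny numeric sketch of the two rules, using the wake-up probability (1/24) and the
assumed 1/3 probability of wearing a green shirt from the example above:

```python
from fractions import Fraction

p_wake_10_to_11 = Fraction(1, 24)   # waking up between 10 AM and 11 AM
p_green_shirt = Fraction(1, 3)      # assumed probability of wearing a green shirt

# Multiplying probabilities (valid because the events are independent)
p_both = p_wake_10_to_11 * p_green_shirt
print(p_both)                       # 1/72

# Adding probabilities: Prob(E1 OR E2) = Prob(E1) + Prob(E2) - Prob(E1 AND E2)
p_either = p_wake_10_to_11 + p_green_shirt - p_both
print(p_either)                     # 13/36

# Upper bound when Prob(E1 AND E2) is unknown
print(p_either <= p_wake_10_to_11 + p_green_shirt)   # True
```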
VI. DNS (Domain Name System)
1. Overview
o DNS is a collection of servers worldwide.
o Input to DNS: A URL (e.g., coursera.org), which is a human-readable name
identifying an object.
2. Functionality
o Output from DNS: The IP address of a web server hosting the content; this address
is not meant to be human-readable.
o The IP address may refer either to the actual web server or to an indirect server
(e.g., a CDN server). A minimal lookup sketch is given below.
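A minimal lookup sketch using Python's standard library, with coursera.org as the example
name from above (the address returned will vary and may belong to a CDN server rather
than the origin server):

```python
import socket

# Input: a human-readable name; output: an IP address (not human-readable).
hostname = "coursera.org"
ip_address = socket.gethostbyname(hostname)
print(f"{hostname} -> {ip_address}")   # e.g., the address of a web or CDN server
```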
VII. Graphs
 Introduction to graphs and their relevance in computing.
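As a minimal illustration of how graphs are commonly represented in code, a small,
made-up undirected graph stored as an adjacency list:

```python
# Adjacency list: node -> list of neighboring nodes
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

print(graph["C"])                                            # neighbors of C: ['A', 'B', 'D']
print({node: len(neighbors) for node, neighbors in graph.items()})   # degree of each node
```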
Cloud Computing
The Hype
o Predictions and statistics from industry experts about the growth and
significance of cloud computing:
 Gartner (2009): Cloud computing revenue expected to exceed $150
billion.
 IDC (2009): IT cloud services spending projected to triple in 5 years,
reaching $42 billion.
 Forrester (2010): Cloud computing growth from $40.7 billion in 2010
to $241 billion in 2020.
o Notable adoption of cloud computing by companies and government entities.
1. Virtualization: This technology allows a single physical machine to run multiple
virtual machines (VMs), each acting as an independent computer. Virtualization
enables efficient use of resources in cloud environments.
o Example: A server in a data center can host several virtual servers, each
running different applications for different users, improving resource
utilization.
2. Distributed Systems: A cloud is essentially a large distributed system where
computing resources are spread across multiple physical locations, all working
together as a unified system. This setup enables efficient handling of large-scale tasks.
o Example: Google’s search engine operates over distributed servers worldwide
to deliver search results rapidly.
3. Networking: The cloud relies heavily on networking protocols (like HTTP, TCP/IP)
to transfer data between users and cloud providers. Cloud resources are often accessed
through the internet, so knowledge of networking fundamentals is crucial for
understanding cloud communication.
o Example: When you access a website hosted on a cloud server, your request
travels over a network, reaching the cloud server, which then delivers the
content back to you.
4. Storage: In cloud computing, data is stored in distributed storage systems (like
Amazon S3 or Google Cloud Storage), which allows for fast access and redundancy.
o Example: Photos uploaded to Google Drive are stored across multiple servers
to ensure that even if one server fails, the data remains available.
5. Security: Cloud computing requires strong security protocols to protect sensitive
data. This includes encryption, firewalls, access control, and multi-factor
authentication.
o Example: A bank storing customer financial data in the cloud would encrypt
the data to ensure that only authorized users can access it.
By grasping these computer science fundamentals, learners will have the foundational
knowledge needed to understand the architecture, principles, and benefits of cloud
computing.
Introduction to Cloud Computing
Cloud computing is a model that allows users to access computing resources (such as servers,
storage, and applications) over the internet, instead of relying on local infrastructure. These
resources are provided by third-party cloud service providers, and users only pay for what
they use, making it a flexible and cost-effective solution for businesses of all sizes.
Why Clouds?
Cloud computing has revolutionized how IT resources are managed and delivered, bringing
several advantages:
1. Cost Savings: Organizations can eliminate the expense of buying and maintaining
hardware and software. They pay only for the services they use, lowering capital
expenditure.
2. Scalability: Resources can be scaled up or down easily to match business needs,
ensuring efficient use of resources without waste.
3. Accessibility: Cloud services are accessible from any location with internet access,
facilitating remote work and global collaboration.
4. Speed and Flexibility: New resources can be provisioned in minutes, allowing
businesses to innovate quickly and respond to market demands.
5. Disaster Recovery: Cloud services often include built-in backup and disaster
recovery features, protecting critical data from loss due to failures.
What is a Cloud?
A "cloud" is a network of remote servers hosted on the internet that
store, manage, and process data. Cloud services are typically delivered
through data centers owned and managed by third-party providers.
The cloud is essentially a distributed system, offering on-demand
access to computing power and storage.
• Cloud = Lots of storage + compute cycles nearby
• A cloud consists of
1. Hundreds to thousands of machines in a datacenter (server side)
2. Thousands to millions of machines accessing these services (client side)
• Servers communicate amongst one another.
• Clients communicate with servers
• Clients also communicate with each other
• A single-site cloud (aka “datacenter”) consists of

➢ Compute nodes (grouped into racks)

➢ Switches, connecting the racks


➢ A network topology, e.g., hierarchical

➢ Storage (backend) nodes connected to the network

➢ Front-end for submitting jobs and receiving client requests

➢ Software services
• A geographically distributed cloud consists of

➢ Multiple such sites

➢ Each site perhaps with a different structure and services

Key Components of Cloud Computing:


1. Cloud Infrastructure: The physical hardware (servers, networking equipment, data
centers) that supports cloud services.
2. Cloud Platforms: The tools and environments that allow developers to create, run,
and manage applications in the cloud.
3. Cloud Services: The services offered to users, typically categorized into three
models: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software
as a Service (SaaS).
History of Cloud Computing
 1960s: The concept of cloud computing originated from the idea of time-sharing,
where mainframe computers allowed multiple users to share processing power
simultaneously.
 1990s: The development of the internet laid the foundation for modern cloud
computing. The term "cloud" began to be used to describe the infrastructure of
telecommunications networks.
 2000s: With the rise of web-based services, companies like Amazon, Google, and
Microsoft started offering cloud computing services, allowing users to access storage,
databases, and processing power via the internet.
 Present: Cloud computing has become a mainstream technology, with many
businesses relying on it for everything from data storage to AI-driven applications.

MANY CLOUD PROVIDERS


• AWS: Amazon Web Services

➢ EC2: Elastic Compute Cloud


➢ S3: Simple Storage Service
➢ EBS: Elastic Block Storage
• Microsoft Azure
• Google Compute Engine
• Rightscale, Salesforce, EMC, Gigaspaces, 10gen, Datastax, Oracle, VMware, Yahoo,
Cloudera
TWO CATEGORIES OF CLOUDS
• Can be either a (i) public cloud, or (ii) private cloud
• Private clouds are accessible only to company employees
• Public clouds provide service to any paying customer:

➢ Amazon S3 (Simple Storage Service): store arbitrary datasets, pay per GB-month
stored
➢ Amazon EC2 (Elastic Compute Cloud): upload and run arbitrary OS images, pay
per CPU hour used
➢ Google App Engine/Compute Engine: develop applications within their App
Engine framework, upload data that will be imported into their format, and run.

TRENDS: TECHNOLOGY
• Doubling periods: storage every 12 months, bandwidth every 9 months, and CPU compute
capacity every 18 months (this last doubling is Moore's Law).
• Then and Now
➢ Bandwidth
❖ 1985: mostly 56 Kbps links nationwide
❖ 2012: Tbps links widespread
➢ Disk capacity
❖ Today's PCs have TBs, far more than a 1990 supercomputer
TRENDS: USERS
• Then and Now
➢ Biologists:
❖ 1990: were running small single-molecule simulations
❖ 2012: CERN's Large Hadron Collider producing many PB/year
PROPHECIES
• In 1965, MIT’s Fernando Corbato and the other designers of the Multics operating system
envisioned a computer facility operating “like a power company or water company.”
• Plug your thin client into the computing utility and play your favorite Intensive Compute &
Communicate Application
➢ Have today’s clouds brought us closer to this reality? Think about it.

What's New in Today's Clouds?


Today’s cloud environments are much more advanced and feature-rich compared to earlier
versions, incorporating cutting-edge technologies and innovations:
1. Hybrid and Multi-Cloud Environments: Many businesses are now adopting a
combination of public and private clouds, along with multiple cloud service providers,
to increase flexibility and avoid vendor lock-in.
2. Edge Computing: Cloud services are extending to the "edge" of the network,
allowing data to be processed closer to the source (e.g., IoT devices), reducing latency
and improving performance.
3. Serverless Computing: Cloud platforms now offer serverless architectures, where
developers can deploy applications without worrying about managing the underlying
infrastructure. Cloud providers automatically allocate resources as needed.
4. AI and Machine Learning Integration: Modern clouds offer built-in machine
learning and AI tools, allowing businesses to analyze vast amounts of data and create
intelligent applications.
5. Containers and Kubernetes: Cloud platforms now support containers (e.g., Docker)
and orchestration tools (e.g., Kubernetes) to improve the scalability and portability of
applications.
I. MASSIVE SCALE
 Facebook (GigaOm, 2012):
o Increased from 30,000 servers in 2009 to 60,000 in 2010, and reached 180,000
by 2012.
 Microsoft (NYTimes, 2008):
o Operates 150,000 machines with a growth rate of 10,000 machines per month.
o Out of these, 80,000 machines are dedicated to running Bing.
 Yahoo! (2009):
o Maintained a fleet of 100,000 servers, organized into clusters of 4,000.
 AWS EC2 (Randy Bias, 2009):
o Comprises 40,000 machines, each equipped with 8 cores.
 eBay (2012):
o Managed 50,000 machines.
 HP (2012):
o Operated 380,000 machines across 180 data centers.
 Google:
o Utilizes a vast and unspecified number of machines.

II. ON-DEMAND ACCESS: *AAS CLASSIFICATION


On-demand services can be likened to renting a cab instead of the traditional approach of
renting or purchasing a car. Here are some examples:
 AWS Elastic Compute Cloud (EC2): Offers compute power at a cost ranging from a
few cents to several dollars per CPU hour.
 AWS Simple Storage Service (S3): Provides storage services priced at a few cents to
several dollars per GB per month.
HaaS: Hardware as a Service
 Provides access to barebones hardware machines, allowing you to utilize them for
your own needs, such as creating your own cluster.
 However, using HaaS can pose security risks.
IaaS: Infrastructure as a Service
 Grants access to flexible computing and storage infrastructure. Virtualization is one
method to achieve this, alongside others like using Linux. IaaS often encompasses
HaaS.
 Examples include Amazon Web Services (AWS: EC2 and S3), Eucalyptus,
Rightscale, and Microsoft Azure.
PaaS: Platform as a Service
 Offers flexible computing and storage infrastructure along with a tightly integrated
software platform.
 An example is Google’s App Engine, which supports programming languages like
Python, Java, and Go.
SaaS: Software as a Service
 Provides access to software services as needed, often described as encompassing
Service-Oriented Architectures (SOA).
 Examples include Google Docs and Microsoft Office on demand.

III. DATA-INTENSIVE COMPUTING


 Computation-Intensive Computing:
o Example Areas: Includes MPI-based systems, high-performance computing,
and grid computing.
o Typically executed on supercomputers, such as NCSA Blue Waters.
 Data-Intensive Computing:
o Focuses on storing large volumes of data at data centers.
o Utilizes compute nodes located nearby to process the data.
o Compute nodes are responsible for running various computational services.
In data-intensive computing, the emphasis shifts from computation to the data itself. As a
result, CPU utilization is no longer the primary resource metric; instead, I/O performance
(whether from disk or network) becomes the key focus.

IV. NEW CLOUD PROGRAMMING PARADIGMS


New cloud programming paradigms make it easier to write and execute highly parallel
programs. Some notable examples include:
 Google: Utilizes MapReduce and Sawzall.
 Amazon: Offers the Elastic MapReduce service, operating on a pay-as-you-go basis.
Google (MapReduce)
 Indexing: Involves a sequence of 24 MapReduce jobs.
 Processed approximately 200,000 jobs handling 50 petabytes of data per month in
2006.
Yahoo! (Hadoop + Pig)
 WebMap: Comprises a series of 100 MapReduce jobs.
 Managed 280 terabytes of data using 2,500 nodes, completing the task in 73 hours.
Facebook (Hadoop + Hive)
 Handled about 300 terabytes of total data, adding 2 terabytes daily in 2008.
 Executed around 3,000 jobs processing 55 terabytes per day.
Similar scales of data processing are reported by other companies, including Yieldex and
eharmony.com.
NoSQL
While MySQL remains an industry standard, NoSQL alternatives such as Cassandra have been
reported to be far faster for certain workloads (figures as high as 2,400x are cited).

Introduction to Clouds: New Aspects of Clouds


Cloud computing is continuously evolving, and some of the new aspects include:
1. Artificial Intelligence and Automation: Many cloud platforms now integrate AI and
automation tools, allowing businesses to optimize resource allocation, improve
performance, and reduce costs without manual intervention.
2. Quantum Computing: Although still in its infancy, quantum computing is becoming
available as a cloud service, enabling organizations to experiment with powerful new
computing capabilities.
3. Edge and Fog Computing: These concepts bring computing power closer to the data
source (such as IoT devices) by processing data locally instead of sending it to a
centralized cloud, reducing latency and bandwidth costs.
4. Data Sovereignty and Compliance: Modern clouds are increasingly focused on
addressing data sovereignty issues, allowing organizations to store data in specific
geographic regions to comply with local regulations.
5. Security Enhancements: As security threats become more sophisticated, cloud
providers are continuously improving encryption techniques, multi-factor
authentication, and automated threat detection.
In short, today's cloud platforms are more flexible, scalable, and feature-rich than ever,
supporting a wide range of applications from basic storage to AI and machine learning.
Economics of Clouds
Cloud economics is the study of the financial aspects of cloud computing. It involves
understanding the costs, benefits, and principles associated with using cloud services. Here
are some key points:
1. Total Cost of Ownership (TCO):
o Cloud computing can reduce the TCO by eliminating the need for physical
hardware and maintenance.
o Costs include subscription fees for cloud services, which can be more
predictable and scalable compared to traditional IT infrastructure.
2. Cost Optimization:
o Cloud providers offer tools and strategies to optimize costs, such as auto-
scaling, which adjusts resources based on demand.
o Pay-as-you-go models allow businesses to pay only for the resources they use,
reducing waste.
3. Economic Benefits:
o Scalability: Easily scale resources up or down based on demand.
o Flexibility: Access to a wide range of services and tools without significant
upfront investment.
o Innovation: Faster deployment of new applications and services, fostering
innovation.
Economic Benefits of Cloud Computing:
1. Pay-as-you-go Model: Businesses pay for the exact amount of computing resources
they consume, rather than investing in expensive hardware that may sit idle.
o Example: A company using Amazon Web Services (AWS) only pays for the
storage and compute power they use, avoiding upfront capital costs.
2. Operational Efficiency: Cloud platforms offer automatic updates and maintenance,
reducing the need for large IT teams and infrastructure management. This minimizes
operational costs.
o Example: Software-as-a-Service (SaaS) applications like Salesforce eliminate
the need for companies to manage software installations and updates.
3. Scalability and Flexibility: The cloud allows companies to quickly scale up or down
based on demand, ensuring they never overpay for unused resources or face
bottlenecks due to limited capacity.
o Example: E-commerce platforms can scale up their server capacity during
peak shopping seasons and reduce it afterward, optimizing resource usage and
costs.
4. Reduced Total Cost of Ownership (TCO): Cloud computing reduces the total cost
of ownership by eliminating capital expenses for hardware, software, and IT
infrastructure. Additionally, the cloud provider handles upgrades, security patches,
and disaster recovery.
5. Global Reach: Businesses can deploy applications and services worldwide without
the need to establish physical infrastructure in multiple regions, saving on logistics
and operational costs.
6. Improved Cash Flow: Since cloud computing is based on operational expenditure
(OpEx), rather than capital expenditure (CapEx), businesses can better manage cash
flow, as they don’t need to invest large amounts upfront.
Understanding Clouds
Cloud computing can be categorized into two types:
1. Public Cloud: Accessible to any paying customer.
2. Private Cloud: Restricted to company employees.
If you're starting a new service or company, a key decision is whether to utilize a public cloud
or invest in a private cloud.
Single Site Cloud: To Outsource or Own?
Consider a medium-sized organization planning to run a service for M months that requires:
 128 servers (1,024 cores)
 524 TB of storage
This setup is similar to the UIUC CCT (University of Illinois at Urbana-Champaign, Cloud
Computing Technologies) cloud site.
Outsourcing Costs (e.g., via AWS, per month):
 S3 storage costs: $0.12 per GB per month.
 EC2 compute costs: $0.10 per CPU hour (based on 2009 rates).
Calculating the monthly outsourcing costs:
 Storage: 524 TB ≈ 524,000 GB × $0.12 ≈ $62.9K per month.
 Compute: 1,024 cores × $0.10 per CPU hour × 24 hours × 30 days ≈ $73.7K per month.
 Total: approximately $136.6K per month.
Owning Costs (capital cost, amortized over the M months the service runs):
 Storage: approximately $349K / M per month.
 Total: approximately $1,555K / M per month, plus $7.5K per month (for 1 system
administrator per 100 nodes), assuming a cost split of 0.45 for hardware, 0.4 for power,
and 0.15 for network, and a 3-year hardware lifespan.
Breakeven Analysis
Owning becomes cheaper than outsourcing once the amortized owning cost falls below the
monthly outsourcing cost (a short arithmetic sketch follows this list):
Breakeven Points:
 Storage: $349K / M < $62.9K per month, so M > 5.55 months.
 Overall: $1,555K / M + $7.5K < $136.6K per month, so M > 12 months.
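A short Python sketch of this arithmetic, using the 2009 rates and the cost assumptions
stated above:

```python
# Outsourcing (per month)
storage_gb = 524 * 1000                      # 524 TB expressed in GB
outsource_storage = 0.12 * storage_gb        # ~ $62.9K per month
outsource_compute = 0.10 * 1024 * 24 * 30    # ~ $73.7K per month
outsource_total = outsource_storage + outsource_compute   # ~ $136.6K per month

# Owning (capital cost amortized over M months, plus $7.5K/month for sysadmins)
own_storage_capital = 349_000
own_total_capital = 1_555_000
sysadmin_per_month = 7_500

# Breakeven: owning wins once the amortized cost drops below the outsourcing cost
m_storage = own_storage_capital / outsource_storage                     # ~ 5.55 months
m_overall = own_total_capital / (outsource_total - sysadmin_per_month)  # ~ 12 months

print(f"Outsourced storage: ${outsource_storage / 1000:.1f}K/month")
print(f"Outsourced total:   ${outsource_total / 1000:.1f}K/month")
print(f"Storage breakeven:  M > {m_storage:.2f} months")
print(f"Overall breakeven:  M > {m_overall:.1f} months")
```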
Conclusion:
 Startups tend to rely heavily on cloud services.
 Cloud providers derive significant revenue from storage services.
Summary
Cloud computing builds on previous generations of distributed systems, particularly the
timesharing and data processing industries of the 1960s and 70s. It is essential to identify the
unique aspects of a problem to classify it as a new cloud computing challenge, considering
factors such as scale, on-demand access, data intensity, and new programming models.
Otherwise, existing solutions may already address your needs.
A Cloud IS a Distributed System
At its core, a cloud is a type of distributed system, where computing resources, data storage,
and applications are distributed across multiple physical locations (servers or data centers)
but are accessible as a unified system over the internet.

Key Features of Cloud as a Distributed System:


1. Multiple Locations: The cloud’s resources are spread across various geographic
regions and data centers, but they work together to serve users seamlessly.
2. Resource Sharing: Multiple users and organizations share the cloud’s resources, with
each having access to isolated environments through virtualization, ensuring
efficiency and security.
3. Fault Tolerance: Cloud systems are designed to handle failures by distributing data
and workloads across multiple machines. If one server fails, others take over, ensuring
continuous availability.
4. Scalability: Distributed cloud systems can scale resources dynamically based on
demand. Users can add or remove computing power, storage, and other resources as
needed.
What is a Distributed System?
A distributed system is a collection of independent computers (nodes) that work together to
provide a unified, coherent service to users. These computers communicate over a network
and coordinate to achieve a common goal, appearing as a single system to the end-user.
Characteristics of Distributed Systems:
1. Multiple Independent Nodes: A distributed system consists of multiple independent
computers that collaborate to perform tasks. These nodes can be located in different
physical locations.
o Example: Google Search is powered by thousands of servers located around
the world, all working together to deliver search results in milliseconds.
2. Concurrency: Multiple nodes can perform tasks simultaneously, improving system
performance and processing large workloads faster.
o Example: In a distributed system handling large-scale web requests, different
nodes can serve different users at the same time, enabling high concurrency.
3. Fault Tolerance: If one or more nodes in a distributed system fail, the system
continues to function, often without the user noticing. This is achieved through
replication, redundancy, and automatic failover mechanisms.
o Example: If a node in Amazon's cloud infrastructure goes down, the workload
can be transferred to another node, preventing service disruption.
4. Scalability: Distributed systems can scale horizontally, meaning additional nodes can
be added to handle more tasks or data. This allows systems to grow without major
architectural changes.
o Example: Facebook's infrastructure scales across thousands of servers to
handle millions of users simultaneously.
5. Resource Sharing: In a distributed system, nodes share resources such as processing
power, data storage, and network bandwidth, optimizing efficiency and resource
utilization.
o Example: Cloud file storage like Google Drive stores files across multiple
servers in different locations, making them accessible from anywhere.
6. Coordination via Communication: Nodes in a distributed system communicate and
coordinate their actions to function as a single unit. This communication occurs via
network protocols like TCP/IP or HTTP.
o Example: In a distributed database, different nodes synchronize to ensure that
data is consistent across all locations, even when changes are made.
The economics of clouds make it an attractive solution for businesses looking to reduce
costs, increase flexibility, and scale effortlessly. Cloud computing is fundamentally a
distributed system, where resources are spread across various locations and users access
them seamlessly over the internet. By understanding the principles of distributed systems,
such as scalability, fault tolerance, and resource sharing, one can better appreciate how the
cloud operates to provide reliable, efficient services.
MapReduce Paradigm
The MapReduce paradigm is a programming model designed for processing large data sets
across a distributed system. Developed by Google, MapReduce simplifies the process of
analyzing huge amounts of data by breaking the task into two phases: Map and Reduce.

1. Map Phase:
o The input data is split into smaller, manageable chunks.
o A user-defined Map function processes each chunk independently and
produces intermediate key-value pairs.
2. Shuffle and Sort:
o After the Map phase, the key-value pairs are shuffled and sorted based on their
keys. This step ensures that all values associated with the same key are
grouped together for the Reduce phase.
3. Reduce Phase:
o The intermediate key-value pairs are then passed to the Reduce function,
which processes the data to produce the final result.
This model allows parallel processing across multiple machines, making it highly scalable
and efficient for large datasets.
Example:
Let’s say you want to count the occurrences of words in a large set of documents.
 Map Phase: Each document is split into words, and for each word, the Map function
emits a key-value pair (word, 1).
 Shuffle and Sort: All occurrences of the same word are grouped together.
 Reduce Phase: The Reduce function sums the values for each word, yielding the total
count of each word across the documents.
PROGRAMMING MAPREDUCE
Externally (User Perspective)
1. Write Programs: Create a short Map program and a short Reduce program.
2. Submit Job: Submit the job and wait for the results.
3. No Need for Expertise: Users don’t need to understand parallel or distributed
programming.
Internally (For the Paradigm and Scheduler)
1. Parallelize Map: Distribute the Map tasks across available nodes.
2. Data Transfer: Move data from the Map phase to the Reduce phase.
3. Parallelize Reduce: Distribute the Reduce tasks across nodes.
4. Storage Implementation: Manage storage for:
o Map input
o Map output
o Reduce input
o Reduce output
Ensure that no Reduce task starts until all Map tasks are completed,
maintaining a barrier between the Map and Reduce phases.
INSIDE MAPREDUCE
For the Cloud
1. Parallelize Map: This is straightforward since each map task operates independently.
All Map output records with the same key are assigned to the same Reduce task.
2. Data Transfer from Map to Reduce:
o All Map output records with the same key are directed to the corresponding
Reduce task.
o This is managed using a partitioning function, such as hash(key) % number of
reducers.
3. Parallelize Reduce: Similarly easy, as each Reduce task is independent of others.
4. Storage Management:
o Map Input: Sourced from a distributed file system.
o Map Output: Written to local disk on the Map node, utilizing the local file
system.
o Reduce Input: Retrieved from multiple remote disks, again using local file
systems.
o Reduce Output: Written back to a distributed file system.
o Local File Systems: Examples include Linux File System (FS) and others.
o Distributed File Systems: Examples include GFS (Google File System) and
HDFS (Hadoop Distributed File System)

Example: Programming Word Count
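A minimal, single-machine Python sketch of the word-count Map and Reduce functions
described above, including the hash-based partitioning step from the previous section
(the function names and documents are illustrative only):

```python
from collections import defaultdict

def map_word_count(document):
    """Map: for each word in the document, emit the pair (word, 1)."""
    for word in document.split():
        yield (word.lower(), 1)

def partition(key, num_reducers):
    """Assign all records with the same key to the same Reduce task."""
    return hash(key) % num_reducers

def reduce_word_count(word, counts):
    """Reduce: sum the values associated with one word."""
    return (word, sum(counts))

documents = ["the cloud is a distributed system",
             "the cloud is elastic"]

# Map phase: process each document independently
intermediate = [pair for doc in documents for pair in map_word_count(doc)]

# Shuffle and sort: group values by key, bucketed per reducer
num_reducers = 2
buckets = [defaultdict(list) for _ in range(num_reducers)]
for word, count in intermediate:
    buckets[partition(word, num_reducers)][word].append(count)

# Reduce phase: each reducer processes only its own bucket
for bucket in buckets:
    for word, counts in sorted(bucket.items()):
        print(reduce_word_count(word, counts))   # e.g., ('cloud', 2), ('the', 2), ...
```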
MapReduce Examples
1. Word Count Example:
o Input: A large collection of text documents.
o Map: For each word in the document, emit (word, 1).

o Reduce: Sum the values associated with each word, producing the total count
for each word in the dataset.
Output: The word count for each word.
2. Sorting Example:
o Input: A large unsorted dataset.
o Map: Break the dataset into chunks and assign each value to a key
representing its sort order.
o Reduce: Merge the sorted chunks to produce a fully sorted dataset.
Output: A sorted list of data.
3. Distributed Grep (Pattern Matching):
o Input: A set of documents and a regular expression.
o Map: For each document, check if the document contains the pattern. If yes,
emit (document ID, content).
o Reduce: Collect the matching documents.
Output: A list of documents matching the pattern.
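A minimal sketch of the distributed grep example, again as plain Python functions run on
a single machine (the pattern and documents are made up for illustration):

```python
import re

def map_grep(doc_id, content, pattern):
    """Map: emit (doc_id, content) if the document matches the pattern."""
    if re.search(pattern, content):
        yield (doc_id, content)

def reduce_grep(doc_id, contents):
    """Reduce: simply collect the matching documents (identity reduce)."""
    return (doc_id, contents)

documents = {
    "doc1": "cloud computing is a distributed system",
    "doc2": "queues are FIFO data structures",
}
pattern = r"cloud"

matches = [pair for doc_id, content in documents.items()
           for pair in map_grep(doc_id, content, pattern)]
for doc_id, content in matches:
    print(reduce_grep(doc_id, [content]))   # only ('doc1', [...]) matches
```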

MapReduce Scheduling
Scheduling in MapReduce is the process of determining how and where the individual tasks
(Map and Reduce tasks) should be executed in a distributed system to optimize resource
utilization and job completion time.
1. JobTracker and TaskTracker:
o JobTracker: Manages the execution of MapReduce jobs. It splits the job into
Map and Reduce tasks and schedules them on worker nodes (TaskTrackers).
o TaskTracker: Runs Map or Reduce tasks and reports the status back to the
JobTracker.
2. Locality-Based Scheduling:
o A critical component of MapReduce scheduling is placing tasks on nodes that
already contain the input data (data locality). This reduces data transfer across
the network, improving job performance.
o The JobTracker tries to schedule tasks on nodes that contain the data or, at the
very least, nodes that are close to the data source in the network.
3. Speculative Execution:
o Sometimes, tasks may get delayed due to hardware issues or other
inefficiencies. MapReduce uses speculative execution to mitigate this by
running backup tasks on other nodes if a task is running slower than expected.
This ensures faster job completion by reducing the impact of stragglers.
Example: The YARN Scheduler
 Framework: YARN (Yet Another Resource Negotiator) is used in Hadoop 2.x and
later versions.
 Resource Management: YARN views each server as a collection of containers,
where a container consists of a specified amount of CPU and memory.
Key Components
1. Global Resource Manager (RM):
o Responsible for overall scheduling and resource allocation across the cluster.
2. Per-Server Node Manager (NM):
o Manages daemon processes and handles server-specific functions, including
monitoring resource usage on each node.
3. Per-Application (Job) Application Master (AM):
o Manages container negotiations with both the Resource Manager and the Node
Managers.
o Responsible for detecting task failures related to its specific job.

MapReduce Fault-Tolerance
Fault tolerance in MapReduce is a crucial feature that ensures the system can recover from
failures without losing data or results.
1. Task-Level Fault Tolerance:
o If a Map or Reduce task fails (due to a node crash, network issues, etc.), the
task is automatically re-executed on another node. The system keeps track of
the progress of each task, so only the failed tasks need to be re-executed.
2. Checkpointing:
o During the execution of tasks, intermediate results are periodically saved
(checkpointed) to disk. This way, if a task fails, it can resume from the last
checkpoint instead of starting over from scratch.
3. Data Replication:
o The underlying storage system (e.g., HDFS in Hadoop) replicates input data
across multiple nodes. If one node fails, the data can be retrieved from other
nodes, ensuring that Map tasks always have access to the input data.
4. Master Failure Recovery:
o While TaskTrackers handle individual tasks, the JobTracker manages the
overall execution. In case the JobTracker fails, the job must be restarted, but in
modern MapReduce implementations (like YARN in Hadoop), fault tolerance
mechanisms ensure that the JobTracker can be restarted without job loss.
Example: If a node running a Map task crashes midway, the system detects the failure,
retrieves the data from a replica, and reassigns the task to another node. The job continues
with minimal disruption.
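A highly simplified sketch of this task-level re-execution idea (not how Hadoop is
actually implemented; the node names, task name, and failure set are made up for
illustration):

```python
def run_task_on(node, task, failed_nodes):
    """Simulate running a task on a node; raise if that node has failed."""
    if node in failed_nodes:
        raise RuntimeError(f"{node} is down")
    return f"result of {task} on {node}"

def run_with_retries(task, nodes, failed_nodes):
    """Re-execute a failed task on another node until it succeeds."""
    for node in nodes:
        try:
            return run_task_on(node, task, failed_nodes)
        except RuntimeError as err:
            print(f"{err}; rescheduling {task} on another node")
    raise RuntimeError(f"{task} failed on all nodes")

# node-a has crashed midway; the task is reassigned and completes on node-b
print(run_with_retries("map-task-7", ["node-a", "node-b", "node-c"], {"node-a"}))
```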
