Cloud Computing - 1
CO3: Identify the architecture, delivery models and industrial platforms for cloud computing
based applications.
CO5: Demonstrate cloud applications in various fields using suitable cloud platforms.
Module-1
Distributed System Models and Enabling Technologies: Scalable Computing Over the
Internet, Technologies for Network Based Systems, System Models for Distributed and Cloud
Computing, Software Environments for Distributed Systems and Clouds, Performance, Security
and Energy Efficiency.
• Machine Architecture:
o Transition from mainframes (centralized) to personal computers (PCs) and,
eventually, to server farms and data centers.
o Emergence of multicore processors, GPUs, and specialized accelerators like TPUs
(Tensor Processing Units) for handling specific tasks like AI and machine
learning.
• Operating Systems:
o From simple, single-user OS (like MS-DOS) to complex, multi-user and multi-
tasking OS (like UNIX, Linux, Windows Server).
o Introduction of virtualization technologies that allow multiple operating systems
to run on a single physical machine.
• Network Connectivity:
o Early standalone computers evolved into networked environments, leading to the
development of the Internet.
o Transition from local area networks (LANs) to wide area networks (WANs) and
global connectivity through the Internet.
o Advent of high-speed networking technologies (fiber optics, 5G) that enable
faster data transfer and lower latency.
Characteristics:
• Data-Intensive Applications:
o Big Data Analytics: Large-scale data processing systems like Hadoop and Spark.
o AI and Machine Learning: Distributed training of machine learning models across
multiple nodes.
• Network-Centric Applications:
o Content Delivery Networks (CDNs): Distribute content (e.g., videos, images)
closer to users to improve loading speeds.
o Online Gaming: Real-time multiplayer gaming that requires low latency and high
data transfer rates.
• Social Media Platforms:
o Highly scalable systems handling billions of user interactions, real-time updates,
and large datasets.
• Overview: The Internet has become an integral part of daily life for billions of people
worldwide. This explosion in Internet usage has created a massive demand for computing
resources that can handle large-scale, concurrent user activities.
• Shift in Computing Needs: Traditional high-performance computing (HPC)
benchmarks, like the Linpack Benchmark, are becoming less relevant as the focus shifts
from purely computational performance to managing vast amounts of data and numerous
simultaneous tasks.
• HPC Overview:
o Designed primarily for scientific and engineering tasks that require significant
computational power, such as simulations, modeling, and complex calculations.
o Measured using benchmarks like Linpack, which focuses on floating-point
operations per second (FLOPS).
• Definition of HTC:
o High-Throughput Computing focuses on maximizing the total number of tasks
completed over a given time rather than just peak performance.
o Utilizes parallel and distributed computing technologies to process numerous
independent tasks simultaneously, ideal for large-scale Internet applications.
• Key Characteristics:
o Scalability: Ability to add more nodes or resources to handle increased
workloads.
o Concurrency: Designed to manage many tasks or users at once, critical for
services like web hosting, cloud storage, and social media platforms.
o Data-Driven Performance: Emphasizes data processing speed, latency
reduction, and high input/output operations per second (IOPS).
• Computing technology has evolved through five distinct generations, with each
generation bringing new advancements that reshaped how we use computers. Each
generation lasted approximately 10 to 20 years, often overlapping with the next.
• The evolution reflects the increasing complexity, capability, and accessibility of
computing systems, moving from centralized mainframes to highly distributed and
interconnected systems.
• HPC Systems:
o Evolution: Supercomputers with massively parallel processors (MPPs) have been
gradually replaced by clusters of cooperative computers.
o Clusters: Composed of homogeneous compute nodes physically connected in
close proximity, providing high-speed communication and shared resources.
o Focus: Primarily used for scientific simulations, modeling, and tasks requiring
immense computational power.
• HTC Systems:
o HTC Overview: Focuses on completing a large number of tasks, often
independent, over a distributed network of nodes. Prioritizes throughput over peak
performance.
o Peer-to-Peer (P2P) Networks: Enable distributed file sharing and content
delivery applications. These systems are built over many globally distributed
client machines, emphasizing a decentralized approach.
o Applications: HTC systems are extensively used in cloud computing, web
services, and P2P platforms, where the emphasis is on handling vast amounts of
data and numerous concurrent user requests.
• Clustering: The use of clusters, which are groups of interconnected computers that work
together as a single system to perform computational tasks efficiently.
• Grid Computing: Extends the concept of clusters to geographically dispersed networks,
forming computational grids that pool resources from multiple locations.
• P2P Networks: Decentralized networks that allow direct sharing of resources among
peers, often used in applications like file sharing, content delivery, and collaborative
computing.
• Cloud Computing: Provides scalable and on-demand access to computing resources
over the Internet, facilitating both HTC and HPC applications with flexible, pay-as-you-
go models.
• Integration of HPC and HTC Systems: While traditionally separate, there is a growing
overlap as HPC tasks are increasingly executed on cloud platforms, and HTC systems
incorporate high-performance elements for specific workloads.
• Rise of Data-Centric Computing: Emphasis is shifting from purely computational tasks
to data-driven approaches, leveraging vast datasets and advanced analytics to drive
insights and decisions.
• Emergence of Web Services and APIs: Many modern applications are built on services
that integrate easily with other platforms, allowing for more complex, distributed, and
interoperable computing environments.
6. Future Directions:
• Edge Computing: Bringing computing resources closer to the data source, enhancing
real-time processing and reducing latency, especially important for IoT applications.
• Quantum Computing: Potentially revolutionary, offering exponentially greater
processing power for certain types of complex problems that are infeasible for classical
computers.
1. Overview of HPC:
• Definition: HPC refers to the use of powerful supercomputers and computing clusters to
solve complex computational problems that require immense processing speed.
• Focus on Speed: For years, HPC systems have prioritized raw speed performance,
measured in floating-point operations per second (FLOPS).
• Evolution of Performance:
o Early 1990s: HPC systems operated at speeds measured in gigaflops (Gflops).
o By 2010: Raw speeds reached the petaflops (Pflops) range, driven by massively parallel supercomputers.
• Linpack Benchmark: The primary metric for measuring HPC performance, focusing on
the system’s ability to solve large sets of linear equations (a rough worked example of the resulting Gflops figure appears after this list).
• Top 500 Supercomputers: A widely recognized list ranking the world's most powerful
computing systems based on Linpack benchmark results.
• Despite their capabilities, supercomputers serve a niche market, with fewer than 10% of
all computer users accessing these systems.
• Primary Users: Mostly scientists, engineers, and researchers working on complex
simulations, data analysis, and modeling tasks.
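The worked example below is a rough sketch of how a Gflops figure like those behind the Linpack rankings can be derived. It is not the official HPL benchmark; it assumes the standard ~(2/3)n³ + 2n² operation count for solving a dense n × n linear system, and the matrix size and solve time are hypothetical.

```python
# Hypothetical Linpack-style Gflops estimate (illustrative only, not the official HPL benchmark).
n = 10_000                                   # dimension of the dense linear system Ax = b
flop_count = (2 / 3) * n**3 + 2 * n**2       # standard operation count for the solve
elapsed_seconds = 25.0                       # assumed measured wall-clock time
gflops = flop_count / elapsed_seconds / 1e9
print(f"Sustained performance: {gflops:.1f} Gflops")   # ~26.7 Gflops for this hypothetical run
```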
1. Overview of HTC:
• Definition: HTC emphasizes completing a large number of tasks efficiently over a given
period, focusing on high-flux computing to manage vast workloads.
• High Throughput: Measures system performance by the number of tasks completed per
unit of time, rather than peak computational speed.
• Scalability: Designed to support millions of users and tasks concurrently, making it ideal
for applications like Internet searches and cloud-based services.
• Cost Efficiency: HTC systems aim to reduce operational costs in data centers by
optimizing resource utilization.
• Energy Savings: Focus on minimizing power consumption through efficient hardware
and software design, critical for large-scale, high-demand environments.
• Security and Reliability: Enhances data protection and system stability, crucial for
enterprise-level computing and data management.
4. Applications of HTC:
• Internet searches and web services delivered to millions or more users simultaneously, along with cloud storage and large-scale data processing.
5. Broader Impact:
• HTC systems are increasingly important for meeting the computing demands of everyday
users, bridging the gap between specialized high-performance computing and general-
purpose, market-driven applications.
1. Service-Oriented Architecture (SOA):
• Definition: SOA is a framework that allows different services to communicate and work
together over a network, enabling Web 2.0 applications.
• Impact: Facilitates modular, interoperable software that can be easily integrated and
reused, making it essential for modern web services.
• Details Covered: SOA will be explored in detail in Chapter 5.
2. Cloud Computing:
• Definition: Cloud computing provides scalable, on-demand access to computing resources over the Internet, typically on a pay-as-you-go basis.
• Impact: Moves computation and storage from local machines and private data centers onto shared, virtualized infrastructure, supporting both HPC and HTC workloads.
3. Internet of Things (IoT):
• Definition: IoT connects physical devices using technologies like RFID, GPS, and
sensors, allowing them to collect and exchange data.
• Impact: Transforms how devices interact with each other and with users, integrating
cyber-physical systems (CPS) into everyday life.
• Details Covered: IoT and CPS will be discussed in Chapter 9.
• Historical Context:
o The Internet’s inception in 1969 set the stage for the growth of networked
computing, envisioned as a utility similar to electricity or telephone services.
• Blurring Lines:
o The distinctions between clusters, grids, P2P systems, and clouds are increasingly
blurred, as these technologies integrate and evolve.
• Future Trends:
o Clouds are expected to process vast datasets from the traditional Internet, social
media, and IoT, pushing the boundaries of distributed and cloud computing
models.
1. Centralized Computing:
o Definition: All computer resources (processors, memory, storage) are centralized
in a single physical system.
o Characteristics:
▪ Resources are fully shared within one integrated operating system.
▪ Common in data centers and supercomputers.
▪ Tightly coupled hardware components.
o Use Cases: Often used in parallel, distributed, and cloud computing.
2. Parallel Computing:
o Definition: A computing paradigm where multiple processors work
simultaneously on different parts of a task.
o Characteristics:
▪ Processors can be tightly coupled (centralized shared memory) or loosely
coupled (distributed memory).
▪ Interprocessor communication via shared memory or message passing.
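A minimal sketch of the two interprocessor communication styles just listed, using Python's standard multiprocessing module on a single machine; the worker functions and values are purely illustrative.

```python
from multiprocessing import Process, Value, Queue

def shared_memory_worker(counter):
    # Tightly coupled style: processes update a shared memory location under a lock.
    with counter.get_lock():
        counter.value += 1

def message_passing_worker(q, payload):
    # Loosely coupled style: processes exchange explicit messages instead of sharing state.
    q.put(payload * payload)

if __name__ == "__main__":
    counter = Value("i", 0)    # shared-memory integer
    q = Queue()                # message-passing channel
    procs = [Process(target=shared_memory_worker, args=(counter,)) for _ in range(4)]
    procs += [Process(target=message_passing_worker, args=(q, n)) for n in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print("shared counter:", counter.value)                          # 4
    print("messages received:", sorted(q.get() for _ in range(4)))   # [0, 1, 4, 9]
```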
1. P2P Networks:
o Definition: Peer-to-peer (P2P) networks consist of distributed nodes (machines)
that communicate and share resources without relying on a centralized server.
o Example: File-sharing systems like BitTorrent.
o Scale: Can involve millions of client machines working simultaneously.
1. Efficiency:
o HPC: Maximize the utilization of resources by exploiting parallelism.
o HTC: Focus on job throughput, optimizing data access, and power efficiency
(throughput per watt).
2. Dependability:
o Ensure reliability and self-management across the system, even in failure
conditions.
o Provide Quality of Service (QoS) assurances at the application and system levels.
3. Adaptation:
o Support billions of job requests across massive data sets.
o Efficiently manage virtualized cloud resources under varying workloads and
service models.
4. Flexibility:
o Distributed systems must be capable of running both HPC (scientific and
engineering) and HTC (business) applications effectively.
Several predictable technology trends continue to shape the evolution of scalable computing
applications. Designers and programmers attempt to forecast the capabilities of future systems to
meet the growing demand for distributed and parallel processing.
• Moore’s Law: Processor performance has doubled roughly every 18 months, a trend that has held for several decades.
• Gilder’s Law: The assertion that network bandwidth has doubled approximately every year in the past (a quick worked check of both growth rates appears after this list).
• Job-Level Parallelism: The parallel execution of large, independent jobs across multiple computing nodes or machines in a distributed environment.
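The quick check below is an illustrative calculation (not from the source) of what the two doubling rates above imply over a decade; the ten-year horizon is an arbitrary choice.

```python
# Illustrative compound-growth check for Moore's law (~18-month doubling)
# and Gilder's law (~12-month doubling).
def growth_factor(years, doubling_period_months):
    return 2 ** (years * 12 / doubling_period_months)

print(f"CPU performance over 10 years:   {growth_factor(10, 18):.0f}x")   # ~101x
print(f"Network bandwidth over 10 years: {growth_factor(10, 12):.0f}x")   # 1024x
```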
Computing paradigms such as clusters, grids, P2P networks, clouds, and web services share several common characteristics. First, they are all ubiquitous in daily life; reliability and scalability are two major design objectives in these computing models.
Second, they are aimed at autonomic operations that can be self-organized to support dynamic discovery.
Finally, these paradigms are composable with QoS and SLAs (service-level agreements). These
paradigms and their attributes realize the computer utility vision.
Utility computing focuses on a business model in which customers receive computing resources
from a paid service provider.
The Internet of Things (IoT) refers to the networked interconnection of everyday objects and
devices, introduced in 1999 at MIT. Unlike the traditional Internet, which connects machines or
web pages, IoT enables the connection of physical objects through technologies like RFID and
GPS. With IPv6, there are enough IP addresses to uniquely identify all objects on Earth,
supporting the tracking of up to 100 trillion items.
IoT allows for communication in three patterns: human-to-human (H2H), human-to-thing (H2T),
and thing-to-thing (T2T). These connections can occur anytime and anywhere, facilitating
dynamic interactions between people and devices. Although still in its early stages, IoT is
expected to grow into a global network of interconnected objects, with cloud computing
supporting efficient and intelligent exchanges. The IoT aims to create a "smart Earth" with
intelligent cities, clean water, efficient transportation, and more, although achieving this vision
will take time.
Cyber-physical systems (CPS) integrate computational processes ("cyber") with physical objects
and environments. They combine the "3C" technologies—computation, communication, and
control—into intelligent feedback systems that bridge the physical and digital worlds. While the
Internet of Things (IoT) focuses on networking physical objects, CPS emphasizes the interaction
between the virtual and physical worlds, often through applications like virtual reality (VR). CPS
has the potential to revolutionize how we engage with the physical world, much like the Internet
transformed virtual interactions.
Modern CPUs have evolved into multicore architectures, featuring dual, quad, or more
processing cores, each capable of executing multiple instruction threads. These processors
exploit instruction-level parallelism (ILP) and task-level parallelism (TLP). Multicore
processors, like Intel’s i7 and AMD’s Opteron, include private L1 caches and shared L2 caches,
with potential future integration of L3 caches. Many-core GPUs, with hundreds to thousands of
cores, also leverage data-level parallelism (DLP). While processor speed has drastically
increased over the years, clock rates have hit a limit near 5 GHz due to power and heat
limitations, requiring innovation in chip design.
The future of multicore CPUs is expected to see an increase in core counts from tens to
potentially hundreds, but their ability to exploit massive data-level parallelism (DLP) is limited
by memory wall issues. This has led to the rise of many-core GPUs, which feature hundreds of
thin cores designed for high performance. Both IA-32 and IA-64 architectures are incorporated
into commercial CPUs, with x86 processors increasingly utilized in high-performance computing
(HPC) and high-throughput computing (HTC) systems. The trend shows a shift from RISC
processors to multicore x86 and many-core GPU systems in top supercomputers. Future
developments may include asymmetric or heterogeneous chip multiprocessors that integrate both
fat CPU cores and thin GPU cores on the same chip.
Optimization Differences:
• HPC Applications: NVIDIA’s CUDA Tesla and Fermi architectures are used in GPU
clusters for parallel processing of large floating-point datasets.
• GPU Programming Model: Figure 1.7 shows the interaction between a CPU and a GPU in performing parallel execution of floating-point operations concurrently (a minimal code sketch of this pattern follows this list).
• The CPU is the conventional multicore processor with limited parallelism to exploit.
• The GPU has a many-core architecture that has hundreds of simple processing cores
organized as multiprocessors. Each core can have one or more threads.
• Essentially, the CPU’s floating-point kernel computation role is largely offloaded to the
many-core GPU. The CPU instructs the GPU to perform massive data processing.
• The bandwidth must be matched between the on-board main memory and the on-chip
GPU memory.
• This process is carried out in NVIDIA’s CUDA programming using the GeForce 8800 or
Tesla and Fermi GPUs.
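The sketch below illustrates this CPU-orchestrates/GPU-executes pattern. It is written with the Numba CUDA binding for Python rather than NVIDIA's C-based CUDA toolkit (an assumption made for illustration): the host copies data across the CPU-GPU memory boundary, launches a data-parallel kernel on many thin cores, and copies the result back.

```python
import numpy as np
from numba import cuda

@cuda.jit
def saxpy(a, x, y, out):
    i = cuda.grid(1)                  # global thread index across the GPU's thin cores
    if i < x.size:
        out[i] = a * x[i] + y[i]      # each thread handles one element (data-level parallelism)

n = 1_000_000
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)

d_x, d_y = cuda.to_device(x), cuda.to_device(y)   # CPU copies data into GPU memory
d_out = cuda.device_array_like(d_x)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
saxpy[blocks, threads_per_block](np.float32(2.0), d_x, d_y, d_out)  # CPU instructs the GPU
result = d_out.copy_to_host()          # results cross the bandwidth-limited link back to the host
```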
Bill Dally of Stanford University considers power and massive parallelism as the major benefits of GPUs over CPUs for the future. By extrapolating current technology and computer architecture, it was estimated that 60 Gflops/watt per core is needed to run an exaflops system (see Figure 1.10). Power constrains what we can put in a CPU or GPU chip. Dally has estimated that a CPU chip consumes about 2 nJ/instruction, while a GPU chip requires about 200 pJ/instruction, roughly one-tenth of the CPU's energy per instruction. The CPU is optimized for latency in caches and memory, while the GPU is optimized for throughput with explicit management of on-chip memory. Figure 1.9 compares the CPU and GPU in their performance/power ratio measured in Gflops/watt per core. In 2010, the GPU achieved about 5 Gflops/watt at the core level, compared with less than 1 Gflop/watt per CPU core.
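A quick back-of-the-envelope check of the figures quoted above; only the numbers stated in the text are used, and the arithmetic itself is illustrative.

```python
EXAFLOPS = 1e18                 # target: 10^18 floating-point operations per second
EFFICIENCY = 60e9               # 60 Gflops/watt per core, as estimated above

print(f"Implied exaflops power budget: {EXAFLOPS / EFFICIENCY / 1e6:.1f} MW")     # ~16.7 MW

cpu_energy = 2e-9               # ~2 nJ per instruction (CPU)
gpu_energy = 200e-12            # ~200 pJ per instruction (GPU)
print(f"GPU energy per instruction: {gpu_energy / cpu_energy:.0%} of the CPU's")  # 10%
```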
Beyond 2011, disk drives and disk arrays surpassed 3TB in capacity. The lower curve in Figure
1.10 reflects a seven-order-of-magnitude growth in disk storage over 33 years. Flash memory
and solid-state drives (SSDs) have also grown rapidly, significantly impacting the future of high-
performance computing (HPC) and high-throughput computing (HTC) systems. SSDs have a
relatively low mortality rate, with each block capable of handling between 300,000 and 1 million
write cycles, allowing them to last several years even with heavy write usage. SSDs and flash
memory will provide significant speed improvements in many applications.
However, large system development will eventually be constrained by factors such as power consumption, cooling, and packaging. Power consumption increases linearly with clock frequency and quadratically with the voltage applied to the chip, so clock rates cannot be raised indefinitely.
In small clusters, the nodes are typically interconnected via an Ethernet switch or a local area network
(LAN). Figure 1.11 illustrates that LANs are commonly used to connect client hosts to large servers. A
storage area network (SAN) connects servers to network storage systems such as disk arrays, while
network-attached storage (NAS) links client hosts directly to disk arrays. All three types of networks—
LAN, SAN, and NAS—are often found in large clusters built using commercial network components. For
smaller clusters without distributed storage, a setup can be created using a multiport Gigabit Ethernet
switch and copper cables to connect the machines. These network types are commercially available and
widely used.
The lower curve in Figure 1.10 highlights the rapid growth of Ethernet bandwidth, which
increased from 10 Mbps in 1979 to 1 Gbps in 1999, and reached 40-100 Gbps by 2011.
Predictions suggested that 1 Tbps network links could become available by 2013. In 2006, Berman, Fox, and Hey reported network link bandwidths of 1,000 Gbps for international connections.
Network performance was reported to double every year, a rate that outpaces Moore’s law for
CPU performance, which doubles every 18 months. This trend indicates that more computers
will be used concurrently, leading to the development of massively distributed systems.
According to the IDC 2010 report, both InfiniBand and Ethernet were predicted to remain the
dominant interconnect technologies in the high-performance computing (HPC) arena. Most data
centers have adopted Gigabit Ethernet to interconnect their server clusters.
A conventional computer uses a single operating system (OS) image, leading to a rigid structure
that tightly couples application software with specific hardware. This makes it difficult for
software to run on different machines with varying instruction sets or OS environments. Virtual
machines (VMs) solve these issues by improving resource utilization, application flexibility,
software management, and security.
For building large clusters, grids, and cloud environments, significant computing, storage, and
networking resources must be virtualized and aggregated to form a unified system image. Cloud
computing, in particular, relies on the dynamic virtualization of processors, memory, and I/O. Key
concepts like VMs, virtual storage, and virtual networking, along with their virtualization software, are
essential for operating modern large-scale systems. Figure 1.12 visually represents different VM
architectures.
In Figure 1.12, the host machine contains physical hardware, such as an x86 desktop running
Windows OS (as shown in part (a)). A Virtual Machine (VM) can be created on any hardware
system, with virtual resources managed by a guest OS to run specific applications. Between the
VM and the host, a middleware layer known as the Virtual Machine Monitor (VMM) is required.
Multiple VMs can be run on the same hardware system, offering hardware independence for the
OS and applications. A VM can run on a different OS than the host, providing portability and
flexibility for running applications across various platforms.
The VMM provides the VM abstraction to the guest OS. With full virtualization, the
VMM exports a VM abstraction identical to the physical machine so that a standard OS such as
Windows 2000 or Linux can run just as it would on the physical hardware. Low-level VMM
operations are indicated by Mendel Rosenblum [41] and illustrated in Figure 1.13.
• First, the VMs can be multiplexed between hardware machines, as shown in Figure 1.13(a).
• Second, a VM can be suspended and stored in stable storage, as shown in Figure 1.13(b).
• Third, a suspended VM can be resumed or provisioned to a new hardware platform, as shown in Figure 1.13(c).
• Finally, a VM can be migrated from one hardware platform to another, as shown in Figure 1.13(d).
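A minimal sketch of these low-level VM operations, assuming a libvirt/KVM host with the libvirt-python binding installed; the domain name "demo-vm", the save-file path, and the destination host URI are hypothetical placeholders.

```python
import libvirt

src = libvirt.open("qemu:///system")          # connect to the local hypervisor (the VMM)
dom = src.lookupByName("demo-vm")             # an existing, running guest VM (hypothetical name)

# Suspend the VM and store its state in stable storage (cf. Figure 1.13(b)).
dom.save("/var/tmp/demo-vm.state")

# Resume/provision the saved VM on the same host (cf. Figure 1.13(c)).
src.restore("/var/tmp/demo-vm.state")
dom = src.lookupByName("demo-vm")

# Live-migrate the running VM to another hardware platform (cf. Figure 1.13(d)).
dst = libvirt.open("qemu+ssh://other-host/system")   # hypothetical destination host
dom.migrate(dst, libvirt.VIR_MIGRATE_LIVE, None, None, 0)

src.close()
dst.close()
```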
Virtual Infrastructures: Physical resources for compute, storage, and networking at the bottom of Figure 1.14 are mapped to the needy applications embedded in various VMs at the top. Hardware and software are then separated. Virtual infrastructure is what connects resources to distributed applications. It is a dynamic mapping of system resources to specific applications.
The result is decreased costs and increased efficiency and responsiveness. Virtualization for
server consolidation and containment is a good example of this.
A large data center may be built with thousands of servers. Smaller data centers are typically
built with hundreds of servers. The cost to build and maintain data center servers has increased
over the years. According to a 2009 IDC report (see Figure 1.14), typically only 30 percent of data center costs goes toward purchasing IT equipment (such as servers and disks), 33 percent is attributed to the chiller, 18 percent to the uninterruptible power supply (UPS), 9 percent to computer room air conditioning (CRAC), and the remaining 7 percent to power distribution,
lighting, and transformer costs. Thus, about 60 percent of the cost to run a data center is allocated
to management and maintenance. The server purchase cost did not increase much with time. The
cost of electricity and cooling did increase from 5 percent to 14 percent in 15 years.
High-end switches or routers may be too cost-prohibitive for building data centers. Thus, using
high-bandwidth networks may not fit the economics of cloud computing. Given a fixed budget,
commodity switches and networks are more desirable in data centers. Similarly, using
commodity x86 servers is more desired over expensive mainframes. The software layer handles
network traffic balancing, fault tolerance, and expandability. Currently, nearly all cloud
computing data centers use Ethernet as their fundamental network technology.
Data Deluge: Jim Gray highlighted the challenge of managing and analyzing the massive influx
of data from sensors, experiments, simulations, archives, and the web. This "data deluge"
demands new tools for data preservation, movement, access, and analysis, including scalable file
systems, databases, algorithms, workflows, and visualization techniques.
Impact on Science: The shift towards data-centric science (e-science) is creating a new
paradigm of discovery through data-intensive technologies. Cloud computing enables the capture
and analysis of vast data sets, supporting interdisciplinary research across fields like biology,
chemistry, physics, and social sciences.
MapReduce: At the platform level, the MapReduce programming model allows for easy data
parallelism and fault tolerance, which is essential for handling large-scale data processing in the
cloud. Iterative MapReduce extends these capabilities to support more complex data mining
algorithms, crucial for scientific applications.
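A single-machine sketch of the MapReduce idea (word count); real systems such as Hadoop and Spark run the same map, shuffle, and reduce phases across many nodes with automatic fault tolerance. The function names and sample documents below are illustrative only.

```python
from collections import defaultdict
from functools import reduce

def map_phase(document):
    # Emit (key, value) pairs independently for each input split.
    return [(word, 1) for word in document.split()]

def shuffle_phase(pairs):
    # Group values by key; in a real cluster this is the network shuffle step.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Combine the values for each key into the final result.
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}

docs = ["the cloud scales", "the cloud stores data", "data drives the cloud"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
print(reduce_phase(shuffle_phase(pairs)))   # e.g. {'the': 3, 'cloud': 3, 'data': 2, ...}
```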
Distributed and cloud computing systems consist of numerous autonomous computer nodes,
interconnected via Storage Area Networks (SANs), Local Area Networks (LANs), or Wide Area
Networks (WANs) in a hierarchical manner. Modern networking technology allows a few LAN
switches to easily connect hundreds of machines into a working cluster. WANs can connect
multiple local clusters to form a larger "cluster of clusters," creating massive systems with
potentially millions of computers connected to edge networks.
These large-scale systems are considered highly scalable, capable of achieving web-scale
connectivity both physically and logically. Table 1.2 classifies massive systems into four
categories:
1. Clusters
2. Peer-to-Peer (P2P) Networks
3. Computing Grids
4. Internet Clouds
These systems can range from hundreds to millions of computers, with nodes participating
collaboratively or cooperatively. The classification also considers various technical and
application aspects, highlighting how these systems function collectively to achieve distributed
computing tasks at different scales.
Figure 1.15 shows a cluster architecture in which server nodes are connected through a low-latency, high-bandwidth interconnection network. This network can be a Storage Area Network
(SAN) like Myrinet or a Local Area Network (LAN) like Ethernet. To scale the cluster with
more nodes, the interconnection can be built hierarchically using multiple levels of Gigabit
Ethernet, Myrinet, or InfiniBand switches.
In most clusters, the node computers are loosely coupled, meaning each node's resources are
independently managed by its own operating system (OS), resulting in multiple system images.
Each autonomous node operates under its OS, so the cluster doesn't share a single system image,
but instead, the nodes work together while retaining individual OS control.
A Single-System Image (SSI) is an ideal concept in cluster design, as noted by Greg Pfister. The
goal of SSI is to merge multiple system images into one cohesive unit, allowing users to interact
with the cluster as if it were a single machine.
1. Unified Resource Management: SSI enables sharing of CPUs, memory, and I/O across
all nodes in the cluster, presenting them as an integrated resource.
2. User Transparency: It creates an illusion for users, who see the cluster as one powerful
system rather than a collection of independent computers.
Without SSI, a cluster with multiple system images functions merely as a group of independent
computers, lacking the seamless integration that SSI provides. This integrated approach enhances
the usability and performance of clustered systems, making them more effective for
computational tasks.
1. Building Blocks:
o Computer Nodes: These can be PCs, workstations, servers, or Symmetric
Multiprocessing (SMP) systems.
o Communication Software: Essential software such as PVM (Parallel Virtual
Machine) or MPI (Message Passing Interface) facilitates communication
among nodes.
o Network Interface Cards: Each node requires a network interface card to
connect with other nodes.
2. Operating System:
o Most HPC clusters operate under Linux OS, which is favored for its performance
and flexibility in managing resources.
3. High-Bandwidth Interconnection:
o Nodes are interconnected using high-speed networks like Gigabit Ethernet,
Myrinet, or InfiniBand, which enable efficient data transfer between nodes.
4. Middleware Support:
o Specialized middleware is necessary to implement Single-System Image (SSI) or
ensure High Availability (HA).
This combination of hardware, software, and middleware is crucial for building efficient,
scalable, and user-friendly HPC clusters that can handle complex computational tasks.
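A minimal sketch of node-to-node message passing on such a cluster, assuming the mpi4py binding to MPI (one of the communication packages named above) is installed on every node; the data and script name are illustrative. It would typically be launched with something like: mpirun -np 4 python cluster_sum.py

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()          # this process's (node's) identity
size = comm.Get_size()          # total number of cooperating processes

# Each node computes a partial result on its own slice of the data...
local_sum = sum(range(rank * 100, (rank + 1) * 100))

# ...and message passing combines the partial results on rank 0.
total = comm.reduce(local_sum, op=MPI.SUM, root=0)

if rank == 0:
    print(f"cluster of {size} processes computed total = {total}")
```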
Unfortunately, a cluster-wide OS for complete resource sharing is not available yet. Middleware
or OS extensions were developed at the user space to achieve SSI at selected functional levels.
Without this middleware, cluster nodes cannot work together effectively to achieve cooperative
computing. The software environments and applications must rely on the middleware to achieve
high performance. The cluster benefits come from scalable performance, efficient message
passing, high system availability, seamless fault tolerance, and cluster-wide job management, as
summarized in Table 1.3.
In the past 30 years, users have experienced a natural growth path from Internet to web and grid
computing services. Internet services such as the Telnet command enable a local computer to connect to a remote computer. A web service such as HTTP enables remote access to remote
web pages. Grid computing is envisioned to allow close interaction among applications running
on distant computers simultaneously. The evolution from Internet to web and grid services is
certainly playing a major role in this growth.
1.3.2.1 Computational Grids