CC UNIT-1
Cloud Computing (Dr. A.P.J. Abdul Kalam Technical University)
LECTURE NOTES ON
(KOE-081)(CLOUD COMPUTING)
(B.Tech)(EC)(YEAR-4TH)(SEM-8TH)
(AKTU)
Mr. ABHISHEK MISHRA
ASSISTANT PROFESSOR
DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING
UNITED INSTITUTE OF TECHNOLOGY, PRAYAGRAJ
UNIT-1
(INTRODUCTION TO CLOUD COMPUTING)
Introduction:
Cloud Computing provides a means of accessing applications as utilities over the Internet. It allows us to create, configure, and customize applications online.
With Cloud Computing, users can access database resources via the Internet from anywhere, for as long as they need them, without worrying about the maintenance or management of the actual resources.
What is Cloud?
The term Cloud refers to a network or the Internet. In other words, the cloud is something that is present at a remote location.
The cloud can provide services over public or private networks, i.e., WAN, LAN, or VPN.
Applications such as e-mail, web conferencing, and customer relationship management (CRM) all run in the cloud.
Cloud Computing Architecture:
Basic Concepts:
Certain services and models work behind the scenes to make cloud computing feasible and accessible to end users. The working models for cloud computing are:
1. Deployment Models
2. Service Models
Deployment Models:
Deployment models define the type of access to the cloud, i.e., how the cloud is located. A cloud can have any of four types of access: public, private, hybrid, and community.
Public cloud: The Public Cloud allows systems and services to be easily accessible to the
general public. Public cloud may be less secure because of its openness, e.g., e-mail.
Private cloud: The Private Cloud allows systems and services to be accessible within an
organization. It offers increased security because of its private nature.
Community cloud: The Community Cloud allows systems and services to be accessible by a group of organizations.
Hybrid cloud: The Hybrid Cloud is a mixture of public and private clouds. Critical activities are performed using the private cloud, while non-critical activities are performed using the public cloud.
Service Models:
Service models are the reference models on which cloud computing is based. They can be categorized into three basic service models as listed below:
1. Infrastructure as a Service (IaaS)
2. Platform as a Service (PaaS)
3. Software as a Service (SaaS)
Infrastructure as a Service (IaaS):
IaaS is the delivery of technology infrastructure as an on-demand, scalable service.
IaaS provides access to fundamental resources such as physical machines, virtual machines, virtual storage, etc.
• Usually billed based on usage
• Usually a multi-tenant virtualized environment
• Can be coupled with managed services for OS and application support
IaaS Examples: Amazon EC2, Google Compute Engine, Microsoft Azure Virtual Machines, Rackspace.
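As an illustrative sketch only (the notes do not prescribe a provider or SDK; the AWS SDK for Python, boto3, and the placeholder image ID, key pair, and region below are assumptions), this is roughly how a virtual machine is requested on demand from an IaaS platform:

# Sketch: provisioning a virtual machine on an IaaS platform (AWS EC2 via boto3).
# The image ID, key pair name, and region are placeholders, not real resources.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-xxxxxxxx",      # placeholder machine image
    InstanceType="t2.micro",     # small, usage-billed instance type
    KeyName="my-key-pair",       # placeholder SSH key pair
    MinCount=1,
    MaxCount=1,
)

print("Launched instance:", response["Instances"][0]["InstanceId"])

Billing is tied to how long the instance runs, which reflects the usage-based billing noted above.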
Platform as a Service (PaaS):
PaaS provides the runtime environment for applications, development & deployment tools,
etc.
PaaS provides all of the facilities required to support the complete life cycle of building and
delivering web applications and services entirely from the Internet.
• Typically, applications must be developed with a particular platform in mind
• Multi-tenant environments
• Highly scalable multi-tier architecture
PaaS Examples: Google App Engine, Microsoft Azure App Service, Heroku, AWS Elastic Beanstalk.
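As a minimal sketch of the kind of application code a PaaS hosts (Flask is an assumption; the notes do not name a framework), the developer supplies only this code while the platform supplies the runtime environment, scaling, and deployment tooling:

# Minimal web application of the kind a PaaS hosts; the platform provides
# the runtime, scaling, and deployment tooling around it.
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from a PaaS-hosted application!"

if __name__ == "__main__":
    # Run locally for development; on a PaaS the platform decides how to serve it.
    app.run(port=5000)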
Software as a Service (SaaS):
The SaaS model allows software applications to be used as a service by end users.
SaaS is a software delivery methodology that provides licensed, multi-tenant access to software and its functions remotely as a Web-based service.
1-Usually billed based on usage
2-Usually a multi-tenant environment
3-Highly scalable architecture
SaaS Examples: Gmail, Google Docs, Microsoft 365, Salesforce, Dropbox.
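As an illustrative sketch of consuming a SaaS product programmatically (the service URL and access token below are hypothetical placeholders, not a real API), end users or client programs reach the hosted software over the web rather than installing it locally:

# Sketch: consuming a SaaS application through its web API.
# The URL and access token are hypothetical placeholders.
import requests

API_URL = "https://api.example-saas.invalid/v1/documents"
headers = {"Authorization": "Bearer <access-token>"}   # placeholder credential

response = requests.get(API_URL, headers=headers, timeout=10)
response.raise_for_status()

for doc in response.json():
    print(doc["id"], doc["title"])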
Advantages:
1-Lower computer costs
2-Improved performance
3-Reduced software costs
4-Instant software updates
5-Improved document format compatibility
6-Unlimited storage capacity
7-Increased data reliability
8-Universal document access
9-Latest version availability
10-Easier group collaboration
11-Device independence
Disadvantages:
1-Requires a constant Internet connection
2-Does not work well with low-speed connections
3-Features might be limited
4-Can be slow
5-Stored data can be lost
6-Stored data might not be secure
Cloud Storage:
1-Create an account with a user name and password
2-Content lives with the account in the cloud
3-Log onto any computer with Wi-Fi to find your content
Download For Storage:
1-Download a cloud-based app onto your computer
2-The app lives on your computer; save files to the app
3-When connected to the Internet, it will sync with the cloud
4-The cloud can be accessed from any Internet connection
Introduction to Parallel Computing
What is Parallel Computing?
Traditionally, software has been written for serial computation:
• To be run on a single computer having a single Central Processing Unit (CPU)
• A problem is broken into a discrete series of instructions
• Instructions are executed one after another
• Only one instruction may execute at any moment in time
Parallel computing is the simultaneous use of multiple compute resources to solve a
computational problem
Accomplished by breaking the problem into independent parts so that each
processing element can execute its part of the algorithm simultaneously with
the others
The computational problem should be:
Solved in less time with multiple compute resources than with a
single compute resource
The compute resources might be
A single computer with multiple processors
Several networked computers
A combination of both
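A minimal sketch of this idea, assuming Python's standard multiprocessing module (the notes do not prescribe a tool): a problem is broken into independent parts that worker processes execute simultaneously.

# Sketch: breaking a problem into independent parts and executing them
# simultaneously on multiple processing elements (worker processes).
from multiprocessing import Pool

def square(x):
    # Each worker executes its part of the algorithm on its own data.
    return x * x

if __name__ == "__main__":
    data = list(range(16))
    with Pool(processes=4) as pool:        # four processing elements
        results = pool.map(square, data)   # parts execute in parallel
    print(results)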
LLNL (Lawrence Livermore National Laboratory) Parallel Computers:
Each compute node is a multi-processor parallel computer.
Multiple compute nodes are networked together with an InfiniBand network.
The Real World is Massively Parallel
In the natural world, many complex, interrelated events are happening at the
same time, yet within a temporal sequence
Compared to serial computing, parallel computing is much better suited for
modeling, simulating, and understanding complex, real-world phenomena.
Uses for Parallel Computing:
Science and Engineering
Historically, parallel computing has been used to model difficult problems in
many areas of science and engineering
• Atmosphere, Earth, Environment, Physics, Bioscience, Chemistry,
Mechanical Engineering, Electrical Engineering, Circuit Design,
Microelectronics, Defense, Weapons
Industrial and Commercial
Today, commercial applications provide an equal or greater driving force in the
development of faster computers
• Data mining, Oil exploration, Web search engines, Medical imaging and
diagnosis, Pharmaceutical design, Financial and economic modeling,
Advanced graphics and virtual reality, Collaborative work environments
Save time and money
In theory, throwing more resources at a task will shorten its time to completion,
with potential cost savings
Parallel computers can be built from cheap, commodity components
Solve larger problems
Many problems are so large and complex that it is impractical or impossible to
solve them on a single computer, especially given limited computer memory
• "Grand Challenge" problems (en.wikipedia.org/wiki/Grand_Challenge) requiring PetaFLOPS and PetaBytes of computing resources
• Web search engines and databases processing millions of transactions per second
Provide concurrency
A single compute resource can only do one thing at a time. Multiple computing
resources can be doing many things simultaneously
• For example, the Access Grid (www.accessgrid.org) provides a global
collaboration network where people from around the world can meet
and conduct work "virtually"
Use of non-local resources
Using compute resources on a wide area network, or even the Internet when
local compute resources are insufficient
• SETI@home (setiathome.berkeley.edu) has over 1.3 million users and 3.4 million computers in nearly every country in the world (source: www.boincsynergy.com/stats/, June 2013)
• Folding@home (folding.stanford.edu) uses over 320,000 computers globally (June 2013)
Limits to serial computing
Transmission speeds
• The speed of a serial computer is directly dependent upon how fast data
can move through hardware.
• Absolute limits are the speed of light and the transmission limit of copper
wire
• Increasing speeds necessitate increasing proximity of processing
elements
Limits to miniaturization
• Processor technology is allowing an increasing number of transistors to
be placed on a chip
• However, even with molecular or atomic-level components, a limit will
be reached on how small components can be
Economic limitations
• It is increasingly expensive to make a single processor faster
• Using a larger number of moderately fast commodity processors to
achieve the same or better performance is less expensive
Current computer architectures are increasingly relying upon hardware level
parallelism to improve performance
• Multiple execution units
• Pipelined instructions
• Multi-core
The Future:
Trends indicated by ever faster networks, distributed systems, and multi-processor computer
architectures clearly show that parallelism is the future of computing.
There has been a greater than 1000x increase in supercomputer performance, with no end
currently in sight.
von Neumann Architecture:
Named after the Hungarian mathematician John von Neumann who first authored the
general requirements for an electronic computer in his 1945 papers
Since then, virtually all computers have followed this basic design
Comprises four main components: memory, control unit, arithmetic logic unit, and input/output.
• Memory (RAM) is used to store both program instructions and data
• The control unit fetches instructions and data from memory, decodes the instructions, and then sequentially coordinates operations to accomplish the programmed task
• The arithmetic logic unit performs basic arithmetic operations
• Input/output is the interface to the human operator
Flynn's Classical Taxonomy:
There are different ways to classify parallel computers. One of the most widely used classifications is called Flynn's Taxonomy.
It is based upon the number of concurrent instruction and data streams available in the architecture.
Single Instruction, Single Data
A sequential computer which exploits no parallelism in either the instruction or
data streams
• Single Instruction: Only one instruction stream is being acted on by the
CPU during any one clock cycle
• Single Data: Only one data stream is being used as input during any one
clock cycle
• Can have concurrent processing characteristics: Pipelined execution
Examples: older-generation mainframes, minicomputers, workstations, and modern-day PCs.
Single Instruction, Multiple Data
A computer which exploits multiple data streams against a single instruction
stream to perform operations
• Single Instruction: All processing units execute the same instruction at
any given clock cycle
• Multiple Data: Each processing unit can operate on a different data
element
• A vector processor implements an instruction set containing instructions that operate on one-dimensional (1D) arrays of data called vectors
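As an illustrative sketch (NumPy is an assumption; the notes do not name a library), a single vectorized operation applied across whole arrays mirrors the SIMD idea of one instruction stream acting on many data elements:

# Sketch: one operation applied element-wise across many data elements,
# mirroring SIMD / vector-processor execution.
import numpy as np

a = np.arange(8)       # [0, 1, ..., 7]
b = np.full(8, 10)     # [10, 10, ..., 10]

c = a + b              # one vectorized "instruction" over all elements
print(c)               # [10 11 12 13 14 15 16 17]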
Multiple Instruction, Single Data
Multiple Instruction: Each processing unit operates on the data independently
via separate instruction streams
Single Data: A single data stream is fed into multiple processing units
Some conceivable uses might be:
• Multiple cryptography algorithms attempting to crack a single coded
message
Multiple Instruction, Multiple Data
Multiple autonomous processors simultaneously executing different instructions
on different data
• Multiple Instruction: Every processor may be executing a different
instruction stream
• Multiple Data: Every processor may be working with a different data
stream
Examples: most current supercomputers, networked parallel computer clusters, grids, clouds, and multi-core PCs.
Many MIMD architectures also include SIMD execution sub-components.
Some General Parallel Terminology
Parallel computing
Using a parallel computer to solve a single problem faster
Parallel computer
Multiple-processor or multi-core system supporting parallel programming
Parallel programming
Programming in a language that supports concurrency explicitly
Supercomputing or High Performance Computing
Using the world's fastest and largest computers to solve large problems
Task
A logically discrete section of computational work.
A task is typically a program or program-like set of instructions that is executed
by a processor
A parallel program consists of multiple tasks running on multiple processors
Shared Memory
From a hardware point of view, describes a computer architecture where all processors
have direct access to common physical memory
In a programming sense, it describes a model where all parallel tasks have the
same "picture" of memory and can directly address and access the same logical
memory locations regardless of where the physical memory actually exists.
Symmetric Multi-Processor (SMP)
Hardware architecture where multiple processors share a single address space
and access to all resources; shared memory computing.
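A minimal sketch of the shared-memory programming model, assuming Python threads (an assumption, since the notes name no tool): all tasks directly address the same logical memory locations.

# Sketch: shared-memory model -- all parallel tasks (threads here) directly
# address the same logical memory locations.
import threading

shared = [0] * 4           # memory visible to every task

def worker(i):
    shared[i] = i * i      # each task writes directly into the common memory

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared)              # [0, 1, 4, 9] -- every task saw the same "picture" of memory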
Distributed Memory
In hardware, refers to network based memory access for physical memory that
is not common
As a programming model, tasks can only logically "see" local machine memory
and must use communications to access memory on other machines where
other tasks are executing
Communications
Parallel tasks typically need to exchange data. There are several ways this can be
accomplished, such as through a shared memory bus or over a network,
however the actual event of data exchange is commonly referred to as
communications regardless of the method employed.
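As an illustrative sketch of explicit communications between tasks that do not share memory (Python's multiprocessing.Pipe stands in for a network link here, which is an assumption of this example, not the notes' setup):

# Sketch: tasks with separate memory exchange data explicitly through
# communications (a local pipe standing in for a network link).
from multiprocessing import Process, Pipe

def worker(conn):
    data = conn.recv()        # receive data sent by the other task
    conn.send(sum(data))      # send a result back
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    parent_conn.send([1, 2, 3, 4])   # explicit data exchange, not shared memory
    print("partial sum:", parent_conn.recv())
    p.join()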
Synchronization
Coordination of parallel tasks in real time, very often associated with
communications
Often implemented by establishing a synchronization point within an application
where a task may not proceed further until another task(s) reaches the same or
logically equivalent point.
Synchronization usually involves waiting by at least one task, and can therefore
cause a parallel application's wall clock execution time to increase.
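A minimal sketch of a synchronization point, assuming Python's threading.Barrier (the notes do not specify a mechanism): no task proceeds past the barrier until every task has reached it, so the faster tasks wait, which is where the extra wall-clock time comes from.

# Sketch: a synchronization point -- tasks may not proceed until all of them
# reach the barrier, so faster tasks spend time waiting.
import threading
import time

barrier = threading.Barrier(3)

def task(name, work_time):
    time.sleep(work_time)    # unequal amounts of useful work
    print(name, "reached the synchronization point")
    barrier.wait()           # wait here until the other tasks arrive
    print(name, "continuing past the barrier")

for name, t in [("task-A", 0.1), ("task-B", 0.2), ("task-C", 0.3)]:
    threading.Thread(target=task, args=(name, t)).start()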
Parallel Overhead
The amount of time required to coordinate parallel tasks, as opposed to doing
useful work.
Parallel overhead can include factors such as:
Task start-up time
Synchronizations
Data communications
Software overhead imposed by parallel languages, libraries, operating
system, etc.
Task termination time
Massively Parallel
Refers to the hardware that comprises a given parallel system - having many
processors.
The meaning of "many" keeps increasing, but currently the largest parallel
computers comprise processors numbering in the hundreds of thousands.
Embarrassingly Parallel
Solving many similar, but independent tasks simultaneously; little to no need for
coordination between the tasks.
Scalability
A parallel system's ability to demonstrate a proportionate increase in parallel
speedup with the addition of more resources.
Multi-core processor
A single computing component with two or more independent actual central
processing units.
Limits and Costs of Parallel Programming
Potential program speedup is defined by the fraction of code (P) that can be parallelized:

    speedup = 1 / (1 - P)

If none of the code can be parallelized, P = 0 and the speedup = 1 (no speedup).
If 50% of the code can be parallelized, the maximum speedup = 2, meaning the code will run twice as fast.
Introducing the number of processors N performing the parallel fraction of work P, and writing S = 1 - P for the serial fraction, the relationship can be modeled by:

    speedup = 1 / (P/N + S)

It soon becomes obvious that there are limits to the scalability of parallelism: no matter how large N becomes, the speedup can never exceed 1 / (1 - P). For example, with P = 0.90 the speedup can never exceed 10.
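A small worked sketch of this limit (plain Python, added here as an illustration rather than taken from the notes), evaluating the speedup formula above for several values of P and N:

# Sketch: evaluating Amdahl's law, speedup = 1 / (P/N + S) with S = 1 - P,
# to show how the serial fraction limits scalability.
def speedup(P, N):
    S = 1.0 - P                 # serial (non-parallelizable) fraction
    return 1.0 / (P / N + S)

for P in (0.50, 0.90, 0.99):
    for N in (10, 100, 1000, 10000):
        print(f"P={P:.2f}  N={N:<6d}  speedup={speedup(P, N):8.2f}")
    # As N grows, the speedup approaches 1 / (1 - P): 2, 10, and 100 respectively.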
Complexity
Parallel applications are much more complex than corresponding serial
applications
• Not only do you have multiple instruction streams executing at the same
time, but you also have data flowing between them
The costs of complexity are measured in programmer time in virtually every
aspect of the software development cycle:
• Design
• Coding
• Debugging
• Tuning
• Maintenance
Portability
Portability issues with parallel programs are not as serious as in years past, due
to standardization in several APIs, such as MPI, POSIX threads, and OpenMP
All of the usual portability issues associated with serial programs apply to
parallel programs
• Hardware architectures are characteristically highly variable and can
affect portability
Resource Requirements
Amount of memory required can be greater for parallel codes than serial codes,
due to the need to replicate data and for overheads associated with parallel
support libraries and subsystems
For short running parallel programs, there can actually be a decrease in
performance compared to a similar serial implementation
• The overhead costs associated with setting up the parallel environment,
task creation, communications and task termination can comprise a
significant portion of the total execution time for short runs
Scalability
Ability of a parallel program's performance to scale is a result of a number of
interrelated factors. Simply adding more processors is rarely the answer
Algorithm may have inherent limits to scalability
Hardware factors play a significant role in scalability. Examples:
• Memory-CPU bus bandwidth on an SMP machine
• Communications network bandwidth
• Amount of memory available on any given machine or set of machines
• Processor clock speed
Introduction to Distributed Systems
A collection of independent computers that appear to the users of the system as a single
computer
A collection of autonomous computers, connected through a network and distribution
middleware which enables computers to coordinate their activities and to share the resources
of the system, so that users perceive the system as a single, integrated computing facility.
Examples: the Internet, ATM machines, and mobile devices in a distributed system.
Why Distributed Systems?
Transparency
Scalability
Fault tolerance
Concurrency
Openness
These challenges can also be seen as the goals or desired properties of a distributed
system
Transparency
Concealment from the user and the application programmer of the separation of the
components of a distributed system
Access Transparency - Local and remote resources are accessed in same way
Location Transparency - Users are unaware of the location of resources
Migration Transparency - Resources can migrate without name change
Replication Transparency - Users are unaware of the existence of multiple
copies of resources
Failure Transparency - Users are unaware of the failure of individual components
Concurrency Transparency - Users are unaware of sharing resources with others
Scalability
Addition of users and resources without suffering a noticeable loss of performance or
increase in administrative complexity.
Adding users and resources causes a system to grow along three dimensions:
Size - growth with regard to the number of users or resources; the system may become overloaded and administration cost may increase
Geography - growth with regard to the distance between nodes; greater communication delays
Administration - increase in administrative cost
Openness
Whether the system can be extended in various ways without disrupting existing
system and services
Hardware extensions
• adding peripherals, memory, communication interfaces
Software extensions
• Operating System features
• Communication protocols
Openness is supported by:
Public interfaces and Standardized communication protocols
Concurrency
In a single system, several processes are interleaved.
In distributed systems there are many systems, each with one or more processors.
Many users simultaneously invoke commands or applications, and access and update shared data.
Key concerns: mutual exclusion, synchronization, and the absence of a global clock (a minimal sketch of mutual exclusion follows).
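A minimal sketch of mutual exclusion over shared data, assuming Python threads stand in for concurrent users (the notes do not specify a mechanism): the lock ensures only one task updates the shared value at a time.

# Sketch: mutual exclusion -- a lock ensures that only one concurrent task
# updates the shared data at a time.
import threading

balance = 0
lock = threading.Lock()

def deposit(amount, times):
    global balance
    for _ in range(times):
        with lock:             # mutual exclusion around the shared update
            balance += amount

threads = [threading.Thread(target=deposit, args=(1, 10000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)                 # always 40000 thanks to mutual exclusion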
Fault tolerance
Hardware, software and networks fail
Distributed systems must maintain availability even at low levels of hardware, software,
network reliability
Fault tolerance is achieved by
Recovery
Redundancy
Issues
Detecting failures
Masking failures
Recovery from failures
Redundancy
Omission and Arbitrary Failures
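As an illustrative sketch of masking failures through redundancy and recovery (the replica names and failure behaviour below are invented for the example), a request is retried against redundant replicas until one of them succeeds:

# Sketch: masking failures with redundancy -- try redundant replicas in turn
# and recover by moving on when one of them fails.
import random

REPLICAS = ["replica-1", "replica-2", "replica-3"]   # hypothetical servers

def call_replica(name):
    # Stand-in for a real network call; each replica fails some of the time.
    if random.random() < 0.3:
        raise ConnectionError(f"{name} did not respond")
    return f"result from {name}"

def fault_tolerant_request():
    for name in REPLICAS:
        try:
            return call_replica(name)           # success: the failure stays masked
        except ConnectionError as err:
            print("failure detected and masked:", err)
    raise RuntimeError("all replicas failed")   # availability lost only if every replica fails

print(fault_tolerant_request())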
Design Requirements
Performance issues
Responsiveness
Throughput
Load sharing, load balancing
Quality of service
Correctness
Reliability, availability, fault tolerance
Security
Performance
Adaptability