Introduction to Cloud Computing
Carnegie Mellon
Lecture Motivation…
General overview on cloud computing
▪ What is cloud computing
▪ Services
▪ Types
▪ Advantages and disadvantages
▪ Enabling technologies
▪ An example infrastructure
Lecture Outline
What is Cloud?
What is Cloud Computing?
Cloud Computing Services
History of Cloud Computing
Why Cloud Computing
Drawbacks of Cloud Computing
Types of Clouds
A Cloud is …
Datacenter hardware and software
that the vendors use to offer the
computing resources and services
Cloud Computing
Represents both the cloud & the provided services
Why call it “cloud computing”?
▪ Some say because the computing happens out there "in the
clouds"
Wikipedia: "the term derives from the fact that most
technology diagrams depict the Internet or IP availability by
using a drawing of a cloud."
Cloud Computing
Who is Who…
[Figure: logos grouped into three roles: cloud providers, cloud users & service providers, and service users. Names shown include IBM, Sun Cloud, SmugMug, Microsoft, Amazon, VMware, AWS, Aptana, Amazon S3, Animoto, 3tera, Dell, Sun Microsystems, Hewlett-Packard, iland, Qloud, Cloud Testing, Citrix Systems, NaviSite, Outsourcery, SynfiniWay, Red Hat, Intelliquib.]
“With Amazon [AWS], on Day One of launch we could scale to the world.”
-Brad Jefferson, Co-Founder & CEO, Animoto
“Animoto has partnered with Amazon to leverage multiple offerings in their Web Services (AWS) platform which, in conjunction with Animoto's own render farm, constitutes the Animoto web infrastructure.”
Users use Animoto to produce video pieces from their photos, video clips and music.
Cloud Computing Services
Three basic services:
Software as a Service (SaaS) model
▪ Apps through browser
Platform as a Service (PaaS) model
▪ Delivery of a computing platform for custom software development as a service
Infrastructure as a Service (IaaS) model
▪ Delivery of computer infrastructure as a service
XaaS, the list continues to grow…
Cloud Services (XaaS)
SaaS (1/3)
Started around 1999
Application is licensed to a customer as a
service on demand
Software Delivery Model:
▪ Hosted on the vendor’s web servers
▪ Downloaded to the consumer’s device and disabled when the on-demand contract is over
SaaS (2/3)
SaaS architecture/ Maturity levels:
▪ Distinguishing attributes: configurability, multi‐tenant efficiency, scalability
[Figure: four maturity levels, each shown with Tenant 1 and Tenant 2 and the application instance(s) serving them]
▪ Level 1: Each tenant has its own customized version of the application and runs its own instance
▪ Level 2 (configurable): Same application, but a distinct instance per customer
▪ Level 3 (configurable + multi-tenant-efficient): A shared instance serves the tenants
  (+): Efficient use of server resources without apparent differences to end users
  (-): Scalability limits
▪ Level 4 (configurable + multi-tenant-efficient + scalable): Tenants reach several instances through a tenant load balancer
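A minimal sketch of the configurable, multi-tenant idea: one shared application instance holds common defaults and per-tenant overrides, so tenants see different behaviour without separate deployments. The class name, tenant ids and settings are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SharedApp:
    """One application instance serving every tenant (hypothetical example)."""
    defaults: dict
    overrides: dict = field(default_factory=dict)  # tenant id -> settings

    def configure(self, tenant, **settings):
        # Store tenant-specific settings without touching other tenants.
        self.overrides.setdefault(tenant, {}).update(settings)

    def settings_for(self, tenant):
        # Tenant-specific values win over the shared defaults.
        return {**self.defaults, **self.overrides.get(tenant, {})}


app = SharedApp(defaults={"theme": "plain", "language": "en"})
app.configure("tenant1", theme="dark")
print(app.settings_for("tenant1"))
print(app.settings_for("tenant2"))
```

Both tenants share the same code and process; only the configuration lookup differs.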
SaaS (3/3)
Examples
PaaS (1/2)
Delivery of an integrated computing platform (to
build/test/deploy custom apps) & solution stack as a
service.
Deploy your applications & don’t worry about buying &
managing the underlying hardware and software layers
PaaS (2/2)
Examples
IaaS (1/5)
Delivery of computer infrastructure (typically a platform virtualization environment) as a service
Buy resources as fully outsourced services:
▪ Servers
▪ Software
▪ Data center space
▪ Network equipment
Example:
IaaS (2/5)
Virtualization Technology is a major enabler of IaaS
▪ It’s a path to share IT resource pools: Web servers, storage,
data, network, software and databases.
▪ Higher utilization rates
[Figure: traditional stack vs. virtualized stack. Traditional: App1, App2, App3 on middleware, a single OS, and hardware. Virtualized: App1 on OS1, App2 on OS2, App3 on OS3, all on a hypervisor and hardware.]
IaaS (3/5)
Virtualization Technology is a major enabler of IaaS
[Figure: virtualization diagram; only the label HARDWARE survives]
IaaS (4/5)
Granularity of VMs
▪ Multi‐core processors
[Figure: a quad-core processor hosting several VMs]
IaaS (5/5)
[Figure: layered example infrastructure, top to bottom]
▪ Request Driven Provisioning & Service Management: Service Request Catalog UI, Operations UI, Monitoring, Capacity Planning, Dynamic Scheduling, SLA
▪ Workloads: Software Development, Virtual Classroom, Web 2.0 Collaborative Innovation, Data Intensive Processing, High Volume Transactions
▪ Virtualization: Virtual Servers, Virtual Storage, Virtual Networks, Virtual Applications & Middleware, Virtual Clients
▪ Physical Layer: Servers, Power Systems, Racks, BladeCenter, Storage, Networking
Resource sharing and consolidation
Offering computing resources as a service or
utility through:
▪ Virtualization
▪ Dynamic provisioning
[Figure: a customizable shared resource divided between User 1 and User 2]
Heterogeneous Physical Resources
[Figure: a customizable shared heterogeneous resource divided among User 1, User 2 and User 3]
More (XaaS): Everything as a Service EaaS
Desktop: DaaS
▪ Use your desktop virtually from anywhere
Communication: CaaS
Virtualization: VaaS
Hardware: HaaS
…etc
Evolution
Discussed in Lecture 1
Enabling Technologies
Virtualization
Web 2.0
Distributed Storage
Distributed Computing
Utility Computing
Network Bandwidth & Latency
Fault‐Tolerant Systems
Why Cloud Computing?
Large‐Scale Data‐Intensive Applications
Flexibility
Scalability
Customized to your current needs:
▪ Hardware
▪ Software
Effect:
▪ Reduce Cost
▪ Reduce Maintenance
▪ High Utilization
▪ High Availability
▪ Reduced Carbon Footprint
Why Cloud Computing?
Flexibility
▪ Software: Any software platform
▪ Access: access resources from any machine
connected to the Internet
▪ Deploy infrastructure from anywhere at anytime
▪ Software controls infrastructure
Why Cloud Computing?
Scalability
▪ Instant
▪ Control via software
▪ Add/cancel/rebuild resources instantly
▪ Start small, then scale your resources up/down as you need
▪ Illusion of infinite resources available on demand
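As a rough illustration of scaling resources up and down under software control, the sketch below computes how many instances are needed for a given load. The function name, the per-instance capacity and the load figures are all assumptions made for the example, not values from the lecture.

```python
import math


def desired_instances(load, capacity_per_instance, minimum=1):
    """Return how many instances are needed to serve `load`."""
    needed = math.ceil(load / capacity_per_instance)
    # Never drop below the minimum, so the service stays reachable.
    return max(minimum, needed)


# Hypothetical requests-per-second samples over a day.
requests_per_second = [40, 120, 450, 900, 300, 60]
for load in requests_per_second:
    print(load, desired_instances(load, capacity_per_instance=100))
```

A real autoscaler would also smooth the load signal and limit how fast it scales; this sketch only shows the start-small, grow-and-shrink idea.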
Why Cloud Computing?
Customization
▪ Everything in your wish list
▪ Software platforms
▪ Storage
▪ Network bandwidth
▪ Speed
Why Cloud Computing?
Cost
▪ Pay‐as‐you‐go model
▪ Small/medium size companies can tap the
infrastructure of corporate giants.
▪ Time to service/market
▪ No upfront cost
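The pay-as-you-go argument can be made concrete with a small arithmetic sketch. Every figure below is a made-up illustration value (in cents, to keep the arithmetic exact), not a real price, and the function names are hypothetical.

```python
def pay_as_you_go(hours_used, cents_per_hour):
    """Total cost in cents when paying only for the hours consumed."""
    return hours_used * cents_per_hour


def owned(purchase_cents, upkeep_cents_per_month, months):
    """Total cost in cents of buying a server upfront and maintaining it."""
    return purchase_cents + upkeep_cents_per_month * months


# All figures are invented for illustration only.
months = 12
cloud_cost = pay_as_you_go(hours_used=200 * months, cents_per_hour=10)
owned_cost = owned(purchase_cents=200_000, upkeep_cents_per_month=5_000, months=months)
print(cloud_cost, owned_cost)
```

With light usage the rented option is cheaper in this example; with heavy, constant usage the comparison can go the other way.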
Why Cloud Computing?
Maintenance
▪ Reduces the size of a client’s IT department
▪ Maintenance is the responsibility of the cloud vendor
▪ This includes:
▪ Software updates
▪ Security patches
▪ Monitoring system’s health
▪ System backup
▪ …etc
Why Cloud Computing?
Utilization
▪ Consolidation of a large number of resources
▪ CPU cycles
▪ Storage
▪ Network Bandwidth
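One way to picture how consolidation raises utilization is to pack lightly loaded workloads onto fewer servers. The sketch below uses a first-fit-decreasing heuristic, which is a swapped-in illustration rather than a method named in the lecture; the load values are hypothetical CPU percentages.

```python
def consolidate(vm_loads, server_capacity):
    """Pack VM loads onto servers using first-fit decreasing."""
    servers = []  # each entry is the list of VM loads placed on one server
    for load in sorted(vm_loads, reverse=True):
        if load > server_capacity:
            raise ValueError("a single VM exceeds the server capacity")
        for placed in servers:
            if sum(placed) + load <= server_capacity:
                placed.append(load)
                break
        else:
            # No existing server had room, so open a new one.
            servers.append([load])
    return servers


# Hypothetical CPU demand (percent of one server) of seven workloads.
vm_loads = [10, 20, 15, 30, 5, 25, 10]
packed = consolidate(vm_loads, server_capacity=100)
print(len(vm_loads), "dedicated servers ->", len(packed), "shared servers")
```

In this example seven mostly idle machines collapse onto two shared ones.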
Why Cloud Computing?
Availability
▪ Having access to software, platform, infrastructure
from anywhere at any time
▪ All you need is a device connected to the internet
Reliability
The system’s fault tolerance is managed by the cloud
providers and users no longer need to worry about it.
Why Cloud Computing?
CO2 Footprint
▪ Consolidation of servers
▪ Higher utilization
▪ Reduced power usage
Drawbacks
Security
Privacy
Vendor lock‐in
Network‐dependent
Migration
Types of Clouds (1/4)
Public
Private
Hybrid
Types of Clouds (2/4)
Public (external) cloud
▪ Open Market for on demand computing and IT resources
▪ Concerns: Limited SLA, Reliability, Availability, Security, Trust and
Confidence
▪ Examples: IBM, Google, Amazon, …
Types of Clouds (3/4)
Private (Internal) cloud
▪ For Enterprises/Corporations with large scale IT
Types of Clouds (4/4)
Hybrid cloud
▪ Extend the private cloud(s) by connecting to external cloud vendors and using their available cloud services
Cloud Burst
▪ Use the local cloud; when you need more resources, burst into the public cloud
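Cloud bursting can be sketched as a simple placement rule: fill the private cloud first and send whatever does not fit to the public cloud. The function name, job sizes and capacity below are hypothetical, and real schedulers would consider cost, data location and security as well.

```python
def place_jobs(job_sizes, private_capacity):
    """Fill the private cloud first; overflow jobs burst to the public cloud."""
    private, public = [], []
    used = 0
    for size in job_sizes:
        if used + size <= private_capacity:
            private.append(size)
            used += size
        else:
            # Not enough local capacity left: burst this job.
            public.append(size)
    return private, public


# Hypothetical job sizes in resource units.
private, public = place_jobs([4, 8, 2, 6, 3], private_capacity=12)
print("private:", private)
print("public:", public)
```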
Types of Applications
Open discussion
System Infrastructure
Large‐scale Data‐centric applications
Exploit parallelism
Easy to manage
Elastic (dynamic?)
Fault‐tolerant
MapReduce and Apache Hadoop
MapReduce: Abstraction that simplifies writing
applications that access massively distributed data
Hadoop: Open source MapReduce software platform
Distributes data and processing across many nodes
Processes the data locally at each node
Transparent fault tolerance through
▪ Automatic data duplication
▪ Automatic detection of failing nodes and restarting of their tasks
MapReduce Programming Model
Functional programming that is easily
parallelizable
Split into two phases:
▪ Map – Perform custom function on all
items in an array
▪ Reduce – Collate map results using custom
function
Scales well – computation separated
from processing dataflow
Illustrative example:
▪ Map that squares the value of numbers in
an array
{1, 2, 3, 4} ‐> {1, 4, 9, 16}
▪ Reduce that sums the squares : 30
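The illustrative example above can be written directly with Python's built-in `map` and `functools.reduce`. This is a single-machine sketch of the programming model only; it does not distribute any work.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Map: apply a custom function to every item.
squares = list(map(lambda x: x * x, numbers))

# Reduce: collate the map results with a custom function.
total = reduce(lambda acc, x: acc + x, squares, 0)

print(squares)  # [1, 4, 9, 16]
print(total)    # 30
```

Because each square is computed independently, the map step could be split across many machines without changing the result.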
Hadoop Map/Reduce
The Map‐Reduce programming model
▪ Framework for distributed processing of large data sets
▪ Pluggable user code runs in generic framework
Example:
▪ cat * | grep | sort | uniq -c | cat > file
▪ input | map | shuffle | reduce | output
Natural for unstructured data:
▪ Log processing
▪ Web search indexing
▪ Ad‐hoc queries
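The input | map | shuffle | reduce | output pipeline can be mimicked in memory for a small log-processing task. In this sketch the map step plays the role of the filter, the shuffle step groups and sorts by key, and the reduce step counts; the function names and log lines are hypothetical.

```python
from collections import defaultdict


def map_phase(lines, keyword):
    """Map: emit (line, 1) for every line containing the keyword."""
    for line in lines:
        if keyword in line:
            yield line, 1


def shuffle_phase(pairs):
    """Shuffle: group values by key and return the groups in key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(groups):
    """Reduce: sum the counts collected for each key."""
    return [(key, sum(values)) for key, values in groups]


log_lines = ["GET /a", "POST /b", "GET /a", "GET /c"]
result = reduce_phase(shuffle_phase(map_phase(log_lines, "GET")))
print(result)  # [('GET /a', 2), ('GET /c', 1)]
```

In a real framework each phase runs on many nodes and the shuffle moves data across the network; here everything stays in one process.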
Apache Hadoop
Open source MapReduce software platform
Automatically provides framework for developing
MapReduce applications
▪ Handles mapping and reducing logistics
▪ Programmer just provides custom functionality
Currently takes custom functionality in Java and Python
Uses an open source Eclipse plug‐in to interface with Hadoop
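One common way to supply Python code to Hadoop is Hadoop Streaming, where the mapper and reducer read lines and write tab-separated key/value lines, and the framework sorts the mapper output by key before the reducer sees it. The sketch below imitates that contract with plain functions so it runs without a cluster; the function names are hypothetical and the `sorted` call stands in for the framework.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' line per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    """Sum the counts per word; the input must already be sorted by key."""
    pairs = (line.split("\t") for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


text = ["the cloud", "the grid and the cloud"]
shuffled = sorted(mapper(text))  # stands in for the framework's sort step
counts = list(reducer(shuffled))
print(counts)
```

The programmer writes only the two functions; splitting the input, sorting and moving data between them is the framework's job.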
HDFS
Very Large Distributed File System
▪ 10K nodes, 100 million files, 10 PB
Assumes Commodity Hardware
▪ Files are replicated in order to handle hardware failure
▪ System detects failures and recovers from them
Optimized for Batch Processing
▪ Data locations exposed so that computations can move to where
data resides
▪ Provides very high aggregate bandwidth
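The idea of moving computation to where the data resides can be sketched as a scheduler that prefers a node already holding a replica of the block. This is a simplified illustration, not Hadoop's actual scheduler; the block ids, node names and slot counts are hypothetical.

```python
def schedule(block_locations, free_slots):
    """Assign each block's task to a node, preferring nodes that hold the block."""
    slots = dict(free_slots)  # copy so the caller's dict is not modified
    assignment = {}
    for block, replicas in block_locations.items():
        local = [node for node in replicas if slots.get(node, 0) > 0]
        # Fall back to any node with a free slot; the data then crosses the network.
        candidates = local or [node for node, free in slots.items() if free > 0]
        if not candidates:
            raise RuntimeError("no free slots left")
        node = candidates[0]
        slots[node] -= 1
        assignment[block] = (node, bool(local))  # (chosen node, data-local?)
    return assignment


block_locations = {
    "blk_1": ["node1", "node2"],
    "blk_2": ["node1", "node3"],
    "blk_3": ["node1"],
}
free_slots = {"node1": 1, "node2": 1, "node3": 1}
for block, (node, is_local) in schedule(block_locations, free_slots).items():
    print(block, node, "local" if is_local else "remote")
```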
Distributed File System
Single Namespace for entire cluster
Data Coherency
▪ Write‐once‐read‐many access model
▪ Client can only append to existing files
Files are broken up into blocks
▪ Typically 128 MB block size
▪ Each block replicated on multiple DataNodes
Intelligent Client
▪ Client can find location of blocks
▪ Client accesses data directly from DataNode
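The block and replica bookkeeping described above can be sketched as follows. The 128 MB block size comes from the slide; the replication factor of 3, the round-robin placement and all names are assumptions for illustration, and the real placement policy is more elaborate.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, as stated above


def plan_blocks(file_size, datanodes, replication=3):
    """Return {block index: [DataNodes holding a replica]} for one file."""
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication")
    # Integer ceiling division: the last block may be only partly filled.
    block_count = (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE
    plan = {}
    for index in range(block_count):
        # Round-robin placement, used here only to keep the sketch short.
        plan[index] = [
            datanodes[(index + r) % len(datanodes)] for r in range(replication)
        ]
    return plan


# A hypothetical 300 MB file on a four-node cluster.
plan = plan_blocks(300 * 1024 * 1024, ["dn1", "dn2", "dn3", "dn4"])
for index, nodes in plan.items():
    print(index, nodes)
```

A client that knows this mapping can read each block directly from one of the listed DataNodes, which is the role the block-location lookup plays in the design above.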