Cloud Computing Course Overview
2
Cloud Computing Course - Overview
I. Introduction to Cloud Computing
i. Overview of Computing
ii. Cloud Computing (NIST Model)
iii. Properties, Characteristics & Disadvantages
iv. Role of Open Standards
II. Cloud Computing Architecture
i. Cloud computing stack
ii. Service Models (XaaS)
a. Infrastructure as a Service (IaaS)
b. Platform as a Service (PaaS)
c. Software as a Service (SaaS)
iii. Deployment Models
III. Service Management in Cloud Computing
i. Service Level Agreements (SLAs)
ii. Cloud Economics
IV. Resource Management in Cloud Computing
Cloud Computing Course (contd.)
V. Data Management in Cloud Computing
i. Looking at Data, Scalability & Cloud Services
ii. Database & Data Stores in Cloud
iii. Large Scale Data Processing
VI. Cloud Security
i. Infrastructure Security
ii. Data security and Storage
iii. Identity and Access Management
iv. Access Control, Trust, Reputation, Risk
VII. Case Study on Open Source and Commercial Clouds, Cloud Simulator
5
Distributed Computing
6
Centralized vs. Distributed Computing
7
Distributed Computing/System?
• Distributed computing
– Field of computer science that studies distributed systems.
– Use of distributed systems to solve computational problems.
• Distributed system
– Wikipedia
• There are several autonomous computational entities,
each of which has its own local memory.
• The entities communicate with each other by message
passing.
– Operating System Concepts
• The processors communicate with one another through various
communication lines, such as high-speed buses or telephone
lines.
• Each processor has its own local memory.
8
Example Distributed Systems
• Internet
• ATM (bank) machines
• Intranets/Workgroups
• Computing landscape will soon consist of ubiquitous
network-connected devices
9
Computers in a Distributed System
• Workstations: Computers used by end-users to perform
computing
• Server Systems: Computers which provide resources and
services
• Personal Assistance Devices: Handheld computers connected to
the system via a wireless communication link.
10
Common properties of Distributed Computing
– Fault tolerance
• When one or more nodes fail, the whole system can still work, though possibly with degraded performance.
• Need to monitor the status of each node
– Each node plays a partial role
• Each computer has only a limited, incomplete view of the system.
• Each computer may know only one part of the input.
– Resource sharing
• Each user can share the computing power and storage resources in the system with other
users
– Load Sharing
• Dispatching tasks across the nodes helps spread the load over the whole system.
– Easy to expand
• Adding nodes should take little effort, ideally none.
– Performance
• Parallel computing can be considered a subset of distributed computing
11
Why Distributed Computing?
• Nature of application
• Performance
– Computing intensive
• Tasks that consume a lot of computation time, for example estimating the value of Pi
with a Monte Carlo simulation (a minimal sketch follows this list).
– Data intensive
• Tasks that deal with a large number or large size of files, for example
Facebook, or LHC (Large Hadron Collider) experimental data processing.
• Robustness
– No SPOF (Single Point Of Failure)
– Other nodes can take over a task that was executing on a failed
node.
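As a concrete illustration of a computing-intensive task, here is a minimal, single-node sketch of the Monte Carlo estimate of Pi mentioned above; the sample count and the idea of splitting work across nodes are illustrative assumptions, not taken from the slides.

```python
import random

def estimate_pi(samples: int) -> float:
    """Estimate Pi by sampling points in the unit square and counting
    how many fall inside the quarter circle of radius 1."""
    inside = 0
    for _ in range(samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples

# In a distributed setting, each node would run estimate_pi() on its own
# share of the samples and a coordinator would combine the partial counts.
print(estimate_pi(1_000_000))
```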
12
13
CLOUD COMPUTING
CLOUD COMPUTING OVERVIEW (contd..)
2
Distributed applications
• Applications that consist of a set of processes that are distributed across a
network of machines and work together as an ensemble to solve a
common problem
3
Clients invoke individual servers
4
A typical distributed application based on peer processes
5
Grid Computing
6
Grid Computing?
• Pcwebopedia.com
– A form of networking. Unlike conventional networks that focus on communication
among devices, grid computing harnesses unused processing cycles of all computers in
a network for solving problems too intensive for any stand-alone machine.
• IBM
– Grid computing enables the virtualization of distributed computing and data resources
such as processing, network bandwidth and storage capacity to create a single system
image, granting users and applications seamless access to vast IT capabilities. Just as
an Internet user views a unified instance of content via the Web, a grid user essentially
sees a single, large virtual computer.
• Sun Microsystems
– Grid Computing is a computing infrastructure that provides dependable,
consistent, pervasive and inexpensive access to computational capabilities
7
Electrical Power Grid Analogy
Electrical Power Grid:
• Users (or electrical appliances) get access to electricity through wall sockets with no care or
consideration for where or how the electricity is actually generated.
• “The power grid” links together power plants of many different kinds.
Grid:
• Users (or client applications) gain access to computing resources (processors, storage, data,
applications, and so on) as needed, with little or no knowledge of where those resources are
located or what the underlying technologies, hardware, operating system, and so on are.
• “The Grid" links together computing resources (PCs, workstations, servers, storage elements)
and provides the mechanism needed to access them.
8
Grid Computing
When to use
1. Share more than information: Data, computing power, applications in
dynamic environment, multi-institutional, virtual organizations
2. Efficient use of resources at many institutes. People from many institutions
working to solve a common problem (virtual organisation).
3. Join local communities.
4. Interactions with the underneath layers must be transparent and seamless
to the user.
9
Need of Grid Computing?
• Today’s Science/Research is based on computations, data analysis, data
visualization & collaborations
• Computer Simulations & Modelling (mathematical modeling of systems) are more cost
effective than experimental methods
• Scientific and Engineering problems are becoming more complex & users
need more accurate, precise solutions to their problems in the shortest possible
time
• Data Visualization is becoming very important
• Exploiting under utilized resources
10
Who uses Grid Computing ?
11
Type of Grids
• Computational Grid: These grids provide secure access to huge pool of shared processing
power suitable for high throughput applications and computation intensive computing.
• Data Grid: Data grids provide an infrastructure to support data storage, data discovery, data
handling, data publication, and data manipulation of large volumes of data actually stored in
various heterogeneous databases and file systems.
• Collaboration Grid: With the advent of the Internet, there has been an increased demand for
better collaboration. Such advanced collaboration is possible using the grid. For instance,
persons from different companies in a virtual enterprise can work on different components of
a CAD project without even disclosing their proprietary technologies.
12
Type of Grids
• Network Grid: A Network Grid provides fault-tolerant and high-performance communication
services. Each grid node works as a data router between two communication points,
providing data-caching and other facilities to speed up the communications between such
points.
• Utility Grid: This is the ultimate form of the Grid, in which not only data and computation
cycles are shared but software or just about any resource is shared. The main services
provided through utility grids are software and special equipment. For instance, the
applications can be run on one machine and all the users can send their data to be processed
to that machine and receive the result back.
13
Grid Components
Source: Kajari Mazumdar “GRID: Computing Without Borders” Department of High Energy Physics TIFR, Mumbai. 14
Cluster Computing
15
What is Cluster Computing?
• A cluster is a type of parallel or distributed computer
system, which consists of a collection of inter-connected
stand-alone computers working together as a single
integrated computing resource .
• Key components of a cluster include multiple standalone
computers (PCs, Workstations, or SMPs), operating systems,
high-performance interconnects, middleware, parallel
programming environments, and applications.
16
Cluster Computing?
• Clusters are usually deployed to improve speed and/or reliability
over that provided by a single computer, while typically being much
more cost effective than a single computer of comparable speed
or reliability
• In a typical cluster:
– Network: Faster, closer connection than a typical
network (LAN)
– Low latency communication protocols
– More loosely coupled than SMP
17
Types of Cluster
18
Cluster Components
• Basic building blocks of clusters are broken down into multiple
categories:
• Cluster Nodes
• Cluster Network
• Network Characterization
19
Key Operational Benefits of Clustering
• System availability: clusters offer inherently high system availability due to
the redundancy of hardware, operating systems, and applications.
• Hardware fault tolerance: redundancy for most system
components (e.g., RAID disks), including both hardware and
software.
• OS and application reliability: run multiple copies of the OS and
applications, and improve reliability through this redundancy.
• Scalability: grow by adding servers to the cluster, by adding more clusters
to the network as the need arises, or by adding CPUs to an SMP node.
• High performance: (running cluster enabled programs)
20
Utility Computing
21
“Utility” Computing ?
• Utility Computing is purely a concept which cloud computing practically implements.
• This model has the advantage of a low or no initial cost to acquire computer resources;
instead, computational resources are essentially rented.
• The word utility is used to make an analogy to other services, such as electrical power,
that seek to meet fluctuating customer needs, and charge for the resources based on
usage rather than on a flat-rate basis. This approach is sometimes known as pay-per-use.
22
“Utility” Computing ?
• "Utility computing" has
usually envisioned some
form of virtualization so
that the amount of
storage or computing
power available is
considerably larger than
that of a single time-
sharing computer.
24
Utility Computing Example
On-Demand Cyber Infrastructure
25
Utility Solution – Your Perspective
[Figure: consumer vs. provider perspectives on a utility solution, covering service and
infrastructure procurement, service assurance, pricing, equipment maintenance, availability,
resource utilization, security, SLA management, technology refresh, system and application
administration, contractors/consultants, sizing, and capacity planning]
Source: Perry Boster, “Utility Computing for Shared Services”,
Massachusetts Digital Government Summit, September 23rd, 2004 –
Boston, MA
26
Utility Computing Payment Models
• Same range of charging models as other utility providers: gas, electricity, telecommunications, water,
television broadcasting
− Flat rate
− Tiered
− Subscription
− Metered
− Pay as you go
− Standing charges
• Different pricing models for different customers based on factors such as scale, commitment and
payment frequency
• But the principle of utility computing remains
• The pricing model is simply an expression by the provider of the costs of provision of the resources and a
profit margin
27
Risks in a UC World
• Data Backup
• Data Security
• Partner Competency
• Defining SLA
• Getting value from charge back
28
Cloud Computing
29
Cloud Computing
US National Institute of Standards and Technology defines cloud computing as
“ Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of
configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be
rapidly provisioned and released with minimal management effort or service provider interaction. ”
30
Source: http://www.smallbiztechnology.com/archive/2011/09/wait-what-is-cloud-computing.html/
31
Cloud Computing - Overview
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
1
Cloud Computing
2
Cloud Computing
US National Institute of Standards and Technology (NIST) defines cloud computing as:
“ Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a
shared pool of configurable computing resources (e.g. networks, servers, storage, applications,
and services) that can be rapidly provisioned and released with minimal management effort or
service provider interaction. ”
http://www.smallbiztechnology.com/archive/2011/09/wait-what-is-cloud-computing.html
3
Essential Characteristics
• On-demand self-service
• A consumer can unilaterally provision computing capabilities, such as server time and network storage, as
needed automatically without requiring human interaction with each service provider (a minimal
provisioning sketch follows this list).
• Resource pooling
• The provider’s computing resources are pooled to serve multiple consumers using a multi-tenant model,
with different physical and virtual resources dynamically assigned and reassigned according to consumer
demand.
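In practice, on-demand self-service means calling the provider's API rather than raising a ticket. The snippet below is a minimal sketch using the boto3 library against Amazon EC2; the region, AMI ID, and instance type are placeholders, and valid credentials and permissions are assumed.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")   # region chosen for illustration

# Provision one small virtual server without any human interaction at the provider.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",    # placeholder AMI ID
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```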
4
Cloud Characteristics
• Measured Service
– Cloud systems automatically control and optimize resource use by leveraging a metering
capability at some level of abstraction appropriate to the type of service (e.g., storage,
processing, bandwidth, and active user accounts). Resource usage can be
monitored, controlled, and reported, providing transparency for both the provider and
consumer of the utilized service (a toy metering example follows this list).
• Rapid elasticity
– Capabilities can be elastically provisioned and released, in some cases automatically, to
scale rapidly outward and inward commensurate with demand. To the consumer, the
capabilities available for provisioning often appear to be unlimited and can be
appropriated in any quantity at any time.
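A toy illustration of how a metering capability turns measured usage into a bill; the rates and usage figures below are invented for the example and do not come from the slides.

```python
# Hypothetical per-unit rates in dollars (illustrative only).
RATES = {"storage_gb_month": 0.02, "cpu_hours": 0.05, "bandwidth_gb": 0.09}

# Metered usage reported for one consumer over a billing period.
usage = {"storage_gb_month": 500, "cpu_hours": 1200, "bandwidth_gb": 300}

bill = sum(RATES[resource] * amount for resource, amount in usage.items())
print(f"Metered bill for the period: ${bill:.2f}")   # 10 + 60 + 27 = $97.00
```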
5
Common Characteristics
• Massive Scale
• Resilient Computing
• Homogeneity
• Geographic Distribution
• Virtualization
• Service Orientation
• Low Cost Software
• Advanced Security
6
Cloud Services Models
• Software as a Service (SaaS)
The capability provided to the consumer is to use the provider’s applications running on a cloud infrastructure. The applications
are accessible from various client devices through either a thin client interface, such as a web browser (e.g., web-based email),
or a program interface.
The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems,
storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration
settings.
7
Cloud Services Models
Platform as a Service (PaaS)
The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or
acquired applications created using programming languages, libraries, services, and tools supported by the
provider.
The consumer does not manage or control the underlying cloud infrastructure including network, servers,
operating systems, or storage, but has control over the deployed applications and possibly configuration
settings for the application-hosting environment.
8
Cloud Services Models
9
Types of Cloud (Deployment Models)
• Private cloud
The cloud infrastructure is operated solely for an organization.
e.g., Windows Server Hyper-V.
• Community cloud
The cloud infrastructure is shared by several organizations and supports a specific goal.
• Public cloud
The cloud infrastructure is made available to the general public
e.g., Google Docs, Google Sheets.
• Hybrid cloud
The cloud infrastructure is a composition of two or more clouds (private, community, or public)
e.g., cloud bursting for load balancing between clouds (a toy bursting policy is sketched below).
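A toy sketch of the cloud-bursting idea just mentioned: workloads run on the private cloud until its (assumed) capacity is exhausted, and the overflow is sent to a public cloud. The capacity figure is invented for illustration.

```python
PRIVATE_CAPACITY = 10   # assumed number of workloads the private cloud can host

def place_workloads(n_workloads: int) -> dict:
    """Fill the private cloud first, then burst the overflow to the public cloud."""
    private = min(n_workloads, PRIVATE_CAPACITY)
    public = n_workloads - private
    return {"private": private, "public": public}

print(place_workloads(7))    # {'private': 7, 'public': 0}
print(place_workloads(16))   # {'private': 10, 'public': 6}
```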
10
Cloud and Virtualization
• Virtual Workspaces:
– An abstraction of an execution environment that can be made dynamically available to
authorized clients by using well-defined protocols,
– Resource quota (e.g. CPU, memory share),
– Software configuration (e.g. OS).
• Implement on Virtual Machines (VMs):
– Abstraction of a physical host machine,
– Hypervisor intercepts and emulates instructions from VMs, and allows management of VMs
(a small management sketch follows),
– VMware, Xen, KVM etc.
• Provide infrastructure API:
– Plug-ins to hardware/support structures
[Figure: virtualized stack with applications running on guest OSes, on a hypervisor, on hardware]
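As a hedged example of managing VMs programmatically through a hypervisor, the sketch below lists the domains known to a local KVM/QEMU host using the libvirt Python bindings; it assumes libvirt and its Python bindings are installed and that the user may open the "qemu:///system" connection.

```python
import libvirt   # Python bindings to the libvirt virtualization API

# Connect to the local KVM/QEMU hypervisor (read-write system connection).
conn = libvirt.open("qemu:///system")

for dom in conn.listAllDomains():
    state = "running" if dom.isActive() else "shut off"
    print(f"{dom.name():20s} {state}")

conn.close()
```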
11
Virtual Machines
• VM technology allows multiple virtual machines to run on a single
physical machine.
• Performance: Para-virtualization (e.g. Xen) is very close to raw physical
performance!
12
Virtualization in General
• Advantages of virtual machines:
– Run operating systems where the physical hardware is unavailable,
– Easier to create new machines, backup machines, etc.,
– Software testing using “clean” installs of operating systems and software,
– Emulate more machines than are physically available,
– Timeshare lightly loaded systems on one host,
– Debug problems (suspend and resume the problem machine),
– Easy migration of virtual machines (shutdown needed or not).
– Run legacy systems
13
Cloud-Sourcing
• Why is it becoming important ?
– Using high-scale/low-cost providers,
– Any time/place access via web browser,
– Rapid scalability; incremental cost and load sharing,
– Reduces the need to focus on local IT.
• Concerns:
– Performance, reliability, and SLAs,
– Control of data, and service parameters,
– Application features and choices,
– Interaction between Cloud providers,
– No standard API – mix of SOAP and REST!
– Privacy, security, compliance, trust…
14
Cloud Storage
• Several large Web companies are now exploiting the fact that they have data
storage capacity that can be hired out to others.
– Allows data stored remotely to be temporarily cached on
desktop computers, mobile phones or other Internet-linked
devices.
• Amazon’s Elastic Compute Cloud (EC2) and Simple Storage Service (S3) are well
known examples (a minimal S3 usage sketch follows)
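As a hedged sketch of how such hired-out storage is used in practice, the snippet below stores and retrieves an object in Amazon S3 with the boto3 library; the bucket name and object key are placeholders, and already-configured AWS credentials are assumed.

```python
import boto3

# Assumes AWS credentials are configured (environment or ~/.aws/credentials).
s3 = boto3.client("s3")

BUCKET = "example-course-bucket"     # placeholder bucket name, not a real bucket

# Upload a local file into the bucket, then read it back over the network.
s3.upload_file("notes.txt", BUCKET, "backups/notes.txt")
obj = s3.get_object(Bucket=BUCKET, Key="backups/notes.txt")
print(obj["Body"].read().decode("utf-8"))
```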
15
Advantages of Cloud Computing
• Lower computer costs:
– No need of a high-powered and high-priced computer to run cloud computing's
web-based applications.
– Since applications run in the cloud, not on the desktop PC, your desktop PC does not
need the processing power or hard disk space demanded by traditional desktop
software.
– When you are using web-based applications, your PC can be less expensive, with a
smaller hard disk, less memory, more efficient processor...
– In fact, your PC in this scenario does not even need a CD or DVD drive, as no
software programs have to be loaded and no document files need to be saved.
16
Advantages of Cloud Computing
• Improved performance:
– With fewer large programs hogging your computer's memory, you will see better
performance from your PC.
– Computers in a cloud computing system boot and run faster because they have fewer
programs and processes loaded into memory.
• Reduced software costs:
– Instead of purchasing expensive software applications, you can get most of what you
need for free from cloud computing applications such as the Google Docs suite.
– This is better than paying for similar commercial software, which alone may be
justification for switching to cloud applications.
17
Advantages of Cloud Computing
• Instant software updates
– Another advantage to cloud computing is that you are no longer faced with choosing
between obsolete software and high upgrade costs.
– When the application is web-based, updates happen automatically and are available the next time
you log into the cloud.
– When you access a web-based application, you get the latest version without needing to pay
for or download an upgrade.
18
Advantages of Cloud Computing
• Unlimited storage capacity
– Cloud computing offers virtually limitless storage.
– Your computer's current 1 terabyte hard drive is small compared to the hundreds of
petabytes available in the cloud.
• Increased data reliability
– Unlike desktop computing, in which a hard disk crash can destroy all your valuable
data, a computer crashing in the cloud should not affect the storage of your data.
• If your personal computer crashes, all your data is still out there in the cloud, still accessible
– In a world where few individual desktop PC users back up their data on a regular basis,
cloud computing is a data-safe computing platform, e.g., Dropbox, SkyDrive
19
Advantages of Cloud Computing
• Universal information access
– Carrying documents around is not a problem with cloud computing, because you do not take your
documents with you.
– Instead, they stay in the cloud, and you can access them whenever you have a
computer and an Internet connection
– Documents are instantly available from wherever you are.
• Latest version availability
– When you edit a document at home, that edited version is what you see when
you access the document at work.
– The cloud always hosts the latest version of your documents; as long as you are
connected, you are not in danger of having an outdated version.
20
Advantages of Cloud Computing
• Easier group collaboration
– Sharing documents leads directly to better collaboration.
– This is an important advantage of cloud computing:
multiple users can collaborate easily on documents and projects
• Device independence
– You are no longer tethered to a single computer or network.
– Changes to computers, applications and documents follow you through the
cloud.
– Move to a portable device, and your applications and documents are still
available.
21
Disadvantages of Cloud Computing
• Requires a constant internet connection
– Cloud computing is impossible if you cannot connect to the Internet.
– Since you use the Internet to connect to both your applications and documents, if you do not
have an Internet connection you cannot access anything, even your own documents.
– A dead Internet connection means no work and in areas where Internet connections are few or
inherently unreliable, this could be a deal-breaker.
• Does not work well with low-speed connections
– Similarly, a low-speed Internet connection, such as that found with dial-up services, makes
cloud computing painful at best and often impossible.
– Web-based applications require a lot of bandwidth to download, as do large documents.
22
Disadvantages of Cloud Computing
• Features might be limited
– This situation is bound to change, but today many web-based applications simply
are not as full-featured as their desktop-based applications.
• For example, you can do a lot more with Microsoft PowerPoint than with Google
Presentation's web-based offering
• Can be slow
– Even with a fast connection, web-based applications can sometimes be slower than
accessing a similar software program on your desktop PC.
– Everything about the program, from the interface to the current document, has to
be sent back and forth from your computer to the computers in the cloud.
– If the cloud servers happen to be backed up at that moment, or if the Internet is
having a slow day, you would not get the instantaneous access you might expect
from desktop applications.
23
Disadvantages of Cloud Computing
• Stored data might not be secured
– With cloud computing, all your data is stored on the cloud.
• The question is: how secure is the cloud?
– Can unauthorized users gain access to your confidential data ?
24
Disadvantages of Cloud Computing
• HPC (High Performance Computing) Systems
– Not clear that you can run compute-intensive HPC applications that use MPI/OpenMP!
– Scheduling is important with this type of application
• as you want all the VMs to be co-located to minimize communication latency!
• General Concerns
– Each cloud system uses different protocols and different APIs
• may not be possible to run applications between cloud based systems
– Amazon has created its own DB system (not SQL 92), and workflow system (many
popular workflow systems out there)
• so your normal applications will have to be adapted to execute on these platforms.
25
Evolution of Cloud Computing
Business drivers for adopting cloud
computing
Reasons
• The main reason for interest in cloud computing is that
public clouds can significantly reduce IT costs.
• From an end user perspective, cloud computing gives the illusion of
potentially infinite capacity, with the ability to scale rapidly and pay only for
the consumed resources.
• In contrast, provisioning for peak capacity is a necessity within private
data centers, leading to a low average utilization of 5-20 percent.
27
IaaS Economics
In-house server: purchase cost $9,600 (x86, 3 Quad-Core, 12 GB RAM, 300 GB HD); cost/hr over 3 years: $0.36
Cloud server: purchase cost $0; cost/hr: $0.68
Cost ratio (cloud / in-house): 1.88
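The per-hour figure for the in-house server follows from simple amortisation. The sketch below assumes the purchase cost is spread over three years of continuous (24x7) operation and ignores power, cooling, and administration, which is why its results only roughly match the rounded values quoted above.

```python
purchase_cost = 9600.0         # in-house server (x86, 3 Quad-Core, 12 GB RAM, 300 GB HD)
hours_3_years = 3 * 365 * 24   # 26,280 hours of continuous operation

in_house_per_hr = purchase_cost / hours_3_years   # ~0.365, i.e. the ~$0.36 in the table
cloud_per_hr = 0.68                               # quoted cloud rate

print(f"In-house cost/hr : ${in_house_per_hr:.2f}")
print(f"Cloud / in-house : {cloud_per_hr / in_house_per_hr:.2f}")   # ~1.86 (slide rounds to 1.88)
```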
29
Benefits for the end user while using public cloud
• In order to enhance portability from one public cloud to another, several
organizations such as Cloud Computing Interoperability Forum and Open
Cloud Consortium are coming up with standards for portability.
• For e.g. Amazon EC2 and Eucalyptus share the same API interface.
• Software startups benefit tremendously by renting computing and storage
infrastructure on the cloud instead of buying them as they are uncertain
about their own future.
30
Benefits for Small and Medium Businesses (<250 employees)
Source: http://www.microsoft.com/en-us/news/presskits/telecom/docs/SMBCloud.pdf
31
Benefits of private cloud
• Cost of 1 server with 12 cores and 12 GB RAM is far lower than the
cost of 12 servers having 1 core and 1 GB RAM.
• Confidentiality of data is preserved
• Virtual machines are cheaper than actual machines
• Virtual machines are faster to provision than actual machines
32
Economics of PaaS vs IaaS
• Consider a web application that needs to be available 24X7, but
where the transaction volume is unpredictable and can vary rapidly
• Using an IaaS cloud, a minimal number of servers would need to be
provisioned at all times to ensure availability
• In contrast, merely deploying the application on a PaaS cloud costs
nothing; costs are incurred depending upon the usage (see the sketch below).
• The PaaS cloud scales automatically to successfully handle increased
requests to the web application.
Source: Enterprise Cloud Computing by Gautam Shroff
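A back-of-the-envelope comparison of the argument above; the instance price, request volume, and per-request charge are invented assumptions, and real provider pricing will differ.

```python
HOURS_PER_MONTH = 730

# IaaS: a minimal fleet must stay up 24x7 just to guarantee availability.
always_on_servers = 2
iaas_rate_per_hour = 0.10                      # assumed instance price
iaas_cost = always_on_servers * iaas_rate_per_hour * HOURS_PER_MONTH

# PaaS: deployment itself costs nothing; charges follow actual requests served.
requests_per_month = 3_000_000                 # unpredictable, varies month to month
paas_rate_per_million = 4.00                   # assumed platform price
paas_cost = (requests_per_month / 1_000_000) * paas_rate_per_million

print(f"IaaS (always-on fleet) : ${iaas_cost:.2f}/month")   # $146.00
print(f"PaaS (pay-per-request) : ${paas_cost:.2f}/month")   # $12.00
```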
33
PaaS benefits
• No need for the user to handle scaling and load balancing of
requests among virtual machines
• PaaS clouds also provide web-based Integrated Development
Environments (IDEs) for development and deployment of applications on
the PaaS cloud.
• Easier to migrate code from the development environment to the
actual production environment.
• Hence developers can directly write applications on the cloud and
don't have to buy separate IDE licenses.
34
SaaS benefits
• Users subscribe to web services and web applications instead of
buying and licensing software instances.
• For example, Google Docs can be used for free, instead of buying
document-editing software such as Microsoft Word.
• Enterprises can use web-based SaaS Customer Relationship
Management (CRM) applications, instead of buying servers and installing
CRM software and associated databases on them.
35
Benefits, as perceived by the IT industry
36
Factors driving investment in cloud
Source: http://www.cloudtweaks.com/2012/01/infographic-whats-driving-investment-in-cloud-computing/
37
Factors driving investment in cloud
Source: http://www.cloudtweaks.com/2012/01/infographic-whats-driving-investment-in-cloud-computing/
38
Purpose of cloud computing in organizations
• Providing an IT platform for business processes involving multiple organizations
• Backing up data
• Running CRM, ERP (enterprise resource planning), or supply chain management applications
• Providing personal productivity and collaboration tools to employees
• Developing and testing software
• Storing and archiving large files (e.g., video or audio)
• Analyzing customer or operations data
• Running e-business or e-government web sites
Source: http://askvisory.com/research/key-drivers-of-cloud-computing-activity/
39
Purpose of cloud computing in organizations
• Analyzing data for research and development
• Meeting spikes in demand on our web site or internal systems
• Processing and storing applications or other forms
• Running data-intensive batch applications (e.g., data conversion, risk modeling,
graphics rendering)
• Sharing information with the government or regulators
• Providing consumer entertainment, information and communication (e.g., music,
video, photos, social networks)
Source: http://askvisory.com/research/key-drivers-of-cloud-computing-activity/
40
Top cloud applications that are driving cloud adaptation
• Mail and Messaging
• Archiving
• Backup
• Storage
• Security
• Virtual Servers
• CRM (Customer Relationship Management)
• Collaboration across enterprises
• Hosted PBX (Private Branch Exchange)
• Video Conferencing
Source: http://www.itnewsafrica.com/2012/09/ten-drivers-of-cloud-computing-for-south-african-businesses/
41
42
CLOUD COMPUTING
CLOUD COMPUTING ARCHITECTURE
Source: http://www.sei.cmu.edu/library/assets/presentations/Cloud%20Computing%20Architecture%20-%20Gerald%20Kaefer.pdf
2
Major building blocks of Cloud Computing
Architecture
• Technical Architecture:
– Structuring according to XaaS stack
– Adopting cloud computing paradigms
– Structuring cloud services and cloud components
– Showing relationships and external endpoints
– Middleware and communication
– Management and security
• Deployment Operation Architecture:
– Geo-location check (Legal issues, export control)
– Operation and Monitoring
Ref: http://www.sei.cmu.edu/library/assets/presentations/Cloud%20Computing%20Architecture%20-%20Gerald%20Kaefer.pdf
3
Cloud Computing Architecture - XaaS
Source: http://www.sei.cmu.edu/library/assets/presentations/Cloud%20Computing%20Architecture%20-%20Gerald%20Kaefer.pdf
4
XaaS Stack views: Customer view vs Provider view
Source: http://www.sei.cmu.edu/library/assets/presentations/Cloud%20Computing%20Architecture%20-%20Gerald%20Kaefer.pdf
5
Microsoft Azure vs Amazon EC2
Source: http://www.sei.cmu.edu/library/assets/presentations/Cloud%20Computing%20Architecture%20-%20Gerald%20Kaefer.pdf
6
Architecture for elasticity
Source: http://www.sei.cmu.edu/library/assets/presentations/Cloud%20Computing%20Architecture%20-%20Gerald%20Kaefer.pdf
7
Service Models (XaaS)
8
Service Models (XaaS)
9
Service Models (XaaS)
Source: Cloud Security and Privacy: An Enterprise Perspective on Risks and Compliance by Tim Mather and Subra Kumaraswamy
10
Service Models (XaaS)
Most common examples of XaaS are
Software as a Service (SaaS)
Platform as a Service (PaaS)
Infrastructure as a Service (IaaS)
11
Requirements of CSP (Cloud Service Provider)
• Increase productivity
• Increase end user satisfaction
• Increase innovation
• Increase agility
12
Service Models (XaaS)
• Broad network access (cloud) + resource pooling (cloud) +
business-driven infrastructure on-demand (SOI) + service-
orientation (SOI) = XaaS
• XaaS fulfils all four demands!
Source: Understanding the Cloud Computing Stack: PaaS, SaaS, IaaS © Diversity Limited, 2011
13
Classical Service Model
• All the layers (hardware, operating system, runtime, middleware, development tools, applications) are
managed by the users.
• Each system is designed and funded for a specific business activity: custom build-to-order.
• Systems are deployed as a vertical stack of “layers” which are tightly coupled.
[Figure: classical stack of applications, runtime, middleware, and O/S, all managed by the user]
14
Key impact of cloud computing for IT function:
From Legacy IT to Evergreen IT
[Figure: simplified IT stacks for Legacy IT vs. Evergreen IT, each with an application layer on top]
15
Classic Model vs. XaaS
16
Client Server Architecture
Source: Wikipedia
17
18
CLOUD COMPUTING
CLOUD COMPUTING ARCHITECTURE
Source: Wikipedia
2
Client server architecture
• Consists of one or more load-balanced servers
servicing requests sent by the clients
• Clients and servers exchange messages in
request-response fashion
• The client is often a thin client or a machine with
low computational capabilities
• The server could be a load-balanced cluster or a
stand-alone machine (a minimal request-response sketch follows).
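The request-response exchange described above can be sketched with Python's standard socket library; the address, port, and the trivial "ack" protocol below are assumptions made purely for illustration. A real deployment would place a load balancer in front of several such servers; the sketch only shows the message exchange itself.

```python
import socket

HOST, PORT = "127.0.0.1", 9090      # illustrative address, not from the slides

def serve_one_request():
    """Minimal server: accept one client connection and answer one request."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen(1)
        conn, _addr = srv.accept()
        with conn:
            request = conn.recv(1024)           # client's request message
            conn.sendall(b"ack: " + request)    # server's response

def client_request(message: bytes) -> bytes:
    """Thin client: send a request and wait for the server's response."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.connect((HOST, PORT))
        sock.sendall(message)
        return sock.recv(1024)
```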
3
Three Tier Client-Server Architecture
Source: Wikipedia
4
Client Server model vs. Cloud model
Client server model:
• Simple service model where the server services client requests
• May or may not be load balanced
• Scalable to some extent in a cluster environment
• No concept of virtualization
Cloud computing model:
• A variety of complex service models, such as IaaS, PaaS, and SaaS, can be provided
• Load balanced
• Theoretically infinitely scalable
• Virtualization is the core concept
5
Cloud Services
Source : http://www.opengroup.org/soa/source-book/socci/extend.htm#figure2
6
Cloud service models
Source: http://www.cs.helsinki.fi/u/epsavola/seminaari/Cloud%20Service%20Models.pdf
7
Simplified description of cloud service models
SaaS applications are designed for end users and are
delivered over the web
PaaS is the set of tools and services designed to make
coding and deploying applications quick and efficient
IaaS is the hardware and software that powers it all –
servers, storage, network, operating systems
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
8
Transportation Analogy
• By itself, infrastructure isn’t useful – it just sits there waiting
for someone to make it productive in solving a particular
problem. Imagine the Interstate transportation system in
the U.S. Even with all these roads built, they wouldn’t be
useful without cars and trucks to transport people and
goods. In this analogy, the roads are the infrastructure and
the cars and trucks are the platform that sits on top of the
infrastructure and transports the people and goods. These
goods and people might be considered the software and
information in the technical realm
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
9
Software as a Service
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
10
SaaS characteristics
• Web access to commercial software
• Software is managed from central location
• Software is delivered in a ‘one to many’ model
• Users not required to handle software upgrades and patches
• Application Programming Interfaces (API) allow for integration
between different pieces of software.
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
11
Applications where SaaS is used
• Applications where there is significant interplay between
organization and outside world. E.g. email newsletter campaign
software
• Applications that have need for web or mobile access. E.g. mobile
sales management software
• Software that is only to be used for a short term need.
• Software where demand spikes significantly. E.g., tax/billing
software.
• E.g. of SaaS: Salesforce Customer Relationship Management (CRM)
software
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
12
Applications where SaaS may not be the best
option
• Applications where extremely fast processing of
real time data is needed
• Applications where legislation or other regulation
does not permit data being hosted externally
• Applications where an existing on-premise solution
fulfills all of the organization’s needs
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
13
Platform as a Service
• Platform as a Service (PaaS) brings the benefits that
SaaS brought for applications over to the software
development world. PaaS can be defined as a
computing platform that allows the creation of web
applications quickly and easily and without the
complexity of buying and maintaining the software and
infrastructure underneath it.
• PaaS is analogous to SaaS except that, rather than being
software delivered over the web, it is a platform for the
creation of software, delivered over the web.
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
14
Characteristics of PaaS
Services to develop, test, deploy, host and maintain applications in the same
integrated development environment. All the varying services needed to fulfill the
application development process.
Web based user interface creation tools help to create, modify, test and deploy
different UI scenarios.
Multi-tenant architecture where multiple concurrent users utilize the same
development application.
Built in scalability of deployed software including load balancing and failover.
Integration with web services and databases via common standards.
Support for development team collaboration – some PaaS solutions include
project planning and communication tools.
Tools to handle billing and subscription management
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
15
Scenarios where PaaS is used
PaaS is especially useful in any situation where multiple developers
will be working on a development project or where other external
parties need to interact with the development process
PaaS is useful where developers wish to automate testing and
deployment services.
The popularity of agile software development, a group of software
development methodologies based on iterative and incremental
development, will also increase the uptake of PaaS as it eases the
difficulties around rapid development and iteration of software.
PaaS Examples: Microsoft Azure, Google App Engine
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
16
Scenarios where PaaS is not ideal
• Where the application needs to be highly portable in terms
of where it is hosted.
• Where proprietary languages or approaches would impact
the development process
• Where a proprietary language would hinder later moves to
another provider – concerns are raised about vendor lock
in
• Where application performance requires customization of
the underlying hardware and software
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
17
Infrastructure as a Service
18
Characteristics of IaaS
• Resources are distributed as a service
• Allows for dynamic scaling
• Has a variable cost, utility pricing model
• Generally includes multiple users on a single
piece of hardware
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
19
Scenarios where IaaS makes sense
Where demand is very volatile – any time there are significant spikes
and troughs in terms of demand on the infrastructure
For new organizations without the capital to invest in hardware
Where the organization is growing rapidly and scaling hardware
would be problematic
Where there is pressure on the organization to limit capital
expenditure and to move to operating expenditure
For specific line of business, trial or temporary infrastructural needs
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
20
Scenarios where IaaS may not be the best
option
• Where regulatory compliance makes the
offshoring or outsourcing of data storage and
processing difficult
• Where the highest levels of performance are
required, and on-premise or dedicated hosted
infrastructure has the capacity to meet the
organization’s needs
Source: http://broadcast.rackspace.com/hosting_knowledge/whitepapers/Understanding-the-Cloud-Computing-Stack.pdf
21
SaaS providers
Source: http://www.cs.helsinki.fi/u/epsavola/seminaari/Cloud%20Service%20Models.pdf
22
Feature comparison of PaaS providers
Source: http://www.cs.helsinki.fi/u/epsavola/seminaari/Cloud%20Service%20Models.pdf
23
Feature comparison of IaaS providers
Source: http://www.cs.helsinki.fi/u/epsavola/seminaari/Cloud%20Service%20Models.pdf
24
XaaS
[Figure: side-by-side SaaS, PaaS, and IaaS stacks (applications, data, runtime, and lower layers)
showing which layers are managed by the user and which by the provider]
25
Role of Networking in cloud computing
• In cloud computing, network resources can be
provisioned dynamically.
• Some of the networking concepts that form the core
of cloud computing are Virtual Local Area Networks,
Virtual Private Networks and the different protocol
layers.
• Examples of tools that help in setting up different
network topologies and facilitate various network
configurations are OpenSSH, OpenVPN etc.
Source: http://www.slideshare.net/alexamies/networking-concepts-and-tools-for-the-cloud
26
Networking in different cloud models
Source: http://www.slideshare.net/alexamies/networking-concepts-and-tools-for-the-cloud
27
Network Function Virtualization
Definition: “Network Functions Virtualisation aims to transform the way that network
operators architect networks by evolving standard IT virtualisation technology to
consolidate many network equipment types onto industry standard high volume servers,
switches and storage, which could be located in Datacentres, Network Nodes and in the end
user premises, as illustrated in Figure 1. It involves the implementation of network functions
in software that can run on a range of industry standard server hardware, and that can be
moved to, or instantiated in, various locations in the network as required, without the need
for installation of new equipment.”
Source: https://portal.etsi.org/nfv/nfv_white_paper.pdf
Network Function Virtualization
Source: https://portal.etsi.org/nfv/nfv_white_paper.pdf
30
CLOUD COMPUTING
ARCHITECTURE - Deployment Models
• Public Cloud
• Private Cloud
• Hybrid Cloud
• Community Cloud
2
Public Cloud
Cloud infrastructure is provisioned for open use by the general
public. It may be owned, managed, and operated by a business,
academic, or government organization, or some combination of
them. It exists on the premises of the cloud provider.
3
Public Cloud
• In Public setting, the provider's computing and storage resources are
potentially large; the communication links can be assumed to be
implemented over the public Internet; and the cloud serves a diverse pool
of clients (and possibly attackers).
Source: LeeBadger, and Tim Grance “NIST DRAFT Cloud Computing Synopsis and Recommendations “
4
Public Cloud
• Workload locations are hidden from clients (public):
– In the public scenario, a provider may migrate a subscriber's workload,
whether processing or data, at any time.
– Workload can be transferred to data centres where cost is low
– Workloads in a public cloud may be relocated anywhere at any time
unless the provider has offered (optional) location restriction policies
• Risks from multi-tenancy (public):
– A single machine may be shared by the workloads of any combination
of subscribers (a subscriber's workload may be co-resident with the
workloads of competitors or adversaries)
• Introduces both reliability and security risk
5
Public Cloud
• Organizations considering the use of a public cloud
should consider:
– Network dependency (public):
• Subscribers connect to providers via the public Internet.
• Connection depends on Internet’s Infrastructure like
– Domain Name System (DNS) servers
– Router infrastructure,
– Inter-router links
6
Public Cloud
• Limited visibility and control over data regarding security (public):
– The details of provider system operation are usually considered proprietary
information and are not divulged to subscribers.
– In many cases, the software employed by a provider is usually proprietary and not
available for examination by subscribers
– A subscriber cannot verify that data has been completely deleted from a
provider's systems.
• Elasticity: illusion of unlimited resource availability (public):
– Public clouds are generally unrestricted in their location or size.
– Public clouds potentially have high degree of flexibility in the movement of
subscriber workloads to correspond with available resources.
7
Public Cloud
8
Private Cloud
• The cloud infrastructure is provisioned for exclusive use by a single organization
comprising multiple consumers (e.g., business units). It may be owned, managed,
and operated by the organization, a third party, or some combination of them, and
it may exist on or off premises.
9
Private Cloud
• Contrary to popular belief, private cloud may exist off premises and
can be managed by a third party. Thus, two private cloud scenarios
exist, as follows:
• On-site Private Cloud
– Applies to private clouds implemented at a customer’s
premises.
• Outsourced Private Cloud
– Applies to private clouds where the server side is outsourced to
a hosting company.
10
On-site Private Cloud
The security perimeter extends around both the subscriber’s on-site
resources and the private cloud’s resources.
The security perimeter does not guarantee control over the private cloud’s
resources, but the subscriber can exercise control over the resources.
Source: LeeBadger, and Tim Grance “NIST DRAFT Cloud Computing Synopsis and Recommendations “
11
On-site Private Cloud
• Organizations considering the use of an on-site private cloud should consider:
– Network dependency (on-site-private):
– Subscribers still need IT skills (on-site-private):
• Subscriber organizations will need the traditional IT skills required to
manage user devices that access the private cloud, and will require
cloud IT skills as well.
– Workload locations are hidden from clients (on-site-private):
• To manage a cloud's hardware resources, a private cloud must be able
to migrate workloads between machines without inconveniencing
clients. With an on-site private cloud, however, a subscriber
organization chooses the physical infrastructure, but individual clients
still may not know where their workloads physically exist within the
subscriber organization's infrastructure
12
On-site Private Cloud
• Risks from multi-tenancy (on-site-private):
– Workloads of different clients may reside concurrently on the same
systems and local networks, separated only by access policies
implemented by a cloud provider's software. A flaw in the software or
the policies could compromise the security of a subscriber
organization by exposing client workloads to one another
• Data import/export, and performance limitations (on-site-private):
– On-demand bulk data import/export is limited by the on-site private
cloud's network capacity, and real-time or critical processing may be
problematic because of networking limitations.
13
On-site Private Cloud
• Potentially strong security from external threats (on-site-private):
– In an on-site private cloud, a subscriber has the option of implementing an
appropriately strong security perimeter to protect private cloud resources against
external threats to the same level of security as can be achieved for non-cloud
resources.
• Significant-to-high up-front costs to migrate into the cloud (on-site-private):
– An on-site private cloud requires that cloud management software be installed on
computer systems within a subscriber organization. If the cloud is intended to support
process-intensive or data-intensive workloads, the software will need to be installed on
numerous commodity systems or on a more limited number of high-performance
systems. Installing cloud software and managing the installations will incur significant
up-front costs, even if the cloud software itself is free, and even if much of the hardware
already exists within a subscriber organization.
14
On-site Private Cloud
15
Outsourced Private Cloud
• Outsourced private cloud has two security perimeters, one implemented
by a cloud subscriber (on the right) and one implemented by a provider.
• Two security perimeters are joined by a protected communications link.
• The security of data and processing conducted in the outsourced private
cloud depends on the strength and availability of both security perimeters
and of the protected communication link.
16
Outsourced Private Cloud
• Organizations considering the use of an outsourced private cloud should
consider:
– Network Dependency (outsourced-private):
• In the outsourced private scenario, subscribers may have an option to
provision unique protected and reliable communication links with the
provider.
– Workload locations are hidden from clients (outsourced-private):
– Risks from multi-tenancy (outsourced-private):
• The implications are the same as those for an on-site private cloud.
17
Outsourced Private Cloud
• Data import/export, and performance limitations (outsourced-private):
– On-demand bulk data import/export is limited by the network capacity
between a provider and subscriber, and real-time or critical processing
may be problematic because of networking limitations. In the outsourced
private cloud scenario, however, these limits may be adjusted, although
not eliminated, by provisioning high-performance and/or high-reliability
networking between the provider and subscriber.
• Potentially strong security from external threats (outsourced-private):
– As with the on-site private cloud scenario, a variety of techniques exist to
harden a security perimeter. The main difference with the outsourced
private cloud is that the techniques need to be applied both to a
subscriber's perimeter and provider's perimeter, and that the
communications link needs to be protected.
18
Outsourced Private Cloud
• Modest-to-significant up-front costs to migrate into the cloud (outsourced-
private):
– In the outsourced private cloud scenario, the resources are provisioned by
the provider
– Main start-up costs for the subscriber relate to:
• Negotiating the terms of the service level agreement (SLA)
• Possibly upgrading the subscriber's network to connect to the
outsourced private cloud
• Switching from traditional applications to cloud-hosted applications,
• Porting existing non-cloud operations to the cloud
• Training
19
Outsourced Private Cloud
20
Community Cloud
Cloud infrastructure is provisioned for exclusive use by a specific community of
consumers from organizations that have shared concerns (e.g., mission, security
requirements, policy, and compliance considerations). It may be owned,
managed, and operated by one or more of the organizations in the community,
a third party, or some combination of them, and it may exist on or off premises.
21
On-site Community Cloud
• Community cloud is made up of a set of participant organizations. Each participant
organization may provide cloud services, consume cloud services, or both
• At least one organization must provide cloud services
• Each organization implements a security perimeter
Source: LeeBadger, and Tim Grance “NIST DRAFT Cloud Computing Synopsis and Recommendations “
22
On-site Community Cloud
• The participant organizations are connected via links between the
boundary controllers that allow access through their security perimeters
• Access policy of a community cloud may be complex
– E.g., if there are N community members, a decision must be made,
either implicitly or explicitly, on how to share a member's local cloud
resources with each of the other members
– Policy specification techniques like role-based access control (RBAC) and
attribute-based access control can be used to express sharing policies (a minimal
RBAC check is sketched below).
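A minimal sketch of the kind of role-based sharing check mentioned above; the roles, member identities, and permissions are invented solely for illustration.

```python
# Hypothetical role-based access control (RBAC) check for a community cloud.
ROLE_PERMISSIONS = {
    "researcher": {"read"},
    "data-owner": {"read", "write", "share"},
}

MEMBER_ROLES = {                      # role assigned to each participant's user
    "alice@org-a": "data-owner",
    "bob@org-b": "researcher",
}

def is_allowed(user: str, action: str) -> bool:
    """Return True if the user's role grants the requested action."""
    role = MEMBER_ROLES.get(user)
    return role is not None and action in ROLE_PERMISSIONS.get(role, set())

print(is_allowed("bob@org-b", "write"))    # False: researchers may only read
print(is_allowed("alice@org-a", "share"))  # True
```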
23
On-site Community Cloud
• Organizations considering the use of an on-site community cloud should consider:
– Network Dependency (on-site community):
• The subscribers in an on-site community cloud need to either
provision controlled inter-site communication links or use
cryptography over a less controlled communications media (such
as the public Internet).
• The reliability and security of the community cloud depends on
the reliability and security of the communication links.
24
On-site Community Cloud
• Subscribers still need IT skills (on-site-community).
– Organizations in the community that provide cloud resources require IT
skills similar to those required for the on-site private cloud scenario, except
that the overall cloud configuration may be more complex and hence
require a higher skill level.
– Identity and access control configurations among the participant
organizations may be complex
• Workload locations are hidden from clients (on-site-community):
– Participant Organizations providing cloud services to the community cloud
may wish to employ an outsourced private cloud as a part of its
implementation strategy.
25
On-site Community Cloud
• Data import/export, and performance limitations (on-site-community):
– The communication links between the various participant organizations in
a community cloud can be provisioned to various levels of performance,
security and reliability, based on the needs of the participant
organizations. The network-based limitations are thus similar to those of
the outsourced-private cloud scenario.
• Potentially strong security from external threats (on-site-community):
– The security of a community cloud from external threats depends on the
security of all the security perimeters of the participant organizations and
the strength of the communications links. These dependencies are
essentially similar to those of the outsourced private cloud scenario, but
with possibly more links and security perimeters.
26
On-site Community Cloud
27
Outsourced Community Cloud
Source: Lee Badger and Tim Grance, "NIST DRAFT Cloud Computing Synopsis and Recommendations"
28
Outsourced Community Cloud
• Organizations considering the use of an outsourced community cloud
should consider:
• Network dependency (outsourced-community):
– The network dependency of the outsourced community cloud is
similar to that of the outsourced private cloud. The primary
difference is that multiple protected communications links are
likely from the community members to the provider's facility.
• Workload locations are hidden from clients (outsourced-
community).
– Same as the outsourced private cloud
29
Outsourced Community Cloud
• Risks from multi-tenancy (outsourced-community):
– Same as the on-site community cloud
• Data import/export, and performance limitations (outsourced-
community):
– Same as outsourced private cloud
• Potentially strong security from external threats (outsourced-
community):
– Same as the on-site community cloud
• Modest-to-significant up-front costs to migrate into the cloud
(outsourced-community):
• Same as outsourced private cloud
30
Outsourced Community Cloud
31
Hybrid Cloud
• The cloud infrastructure is a composition of two or more distinct cloud
infrastructures (private, community, or public) that remain unique entities,
but are bound together by standardized or proprietary technology that
enables data and application portability
32
Hybrid Cloud
• A hybrid cloud is composed of two or more private, community, or public
clouds.
• Hybrid clouds have significant variations in performance, reliability, and security
properties depending upon the types of cloud chosen to build the hybrid cloud.
Source: Lee Badger and Tim Grance, "NIST DRAFT Cloud Computing Synopsis and Recommendations"
33
Hybrid Cloud
34
35
CLOUD COMPUTING
Virtualization
PROF. SOUMYA K. GHOSH
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
IIT KHARAGPUR
IaaS – Infrastructure as a Service
• What does a subscriber get?
– Access to virtual computers, network-accessible storage, network
infrastructure components such as firewalls, and configuration services.
2
IaaS Provider/Subscriber Interaction Dynamics
• The provider has a number of available virtual machines
(vm’s) that it can allocate to clients.
– Client A has access to vm1 and vm2, Client B has access to vm3 and Client C has
access to vm4, vm5 and vm6
– Provider retains only vm7 through vmN
Source: Lee Badger and Tim Grance, "NIST DRAFT Cloud Computing Synopsis and Recommendations"
3
IaaS Component Stack and Scope of Control
• The IaaS component stack comprises hardware, operating system,
middleware, and applications layers.
• Operating system layer is split into two layers.
– Lower (and more privileged) layer is occupied by the Virtual Machine Monitor (VMM),
which is also called the Hypervisor
– Higher layer is occupied by an operating system running within a VM called a guest
operating system
Source: Lee Badger and Tim Grance, "NIST DRAFT Cloud Computing Synopsis and Recommendations"
4
IaaS Component Stack and Scope of Control
• In IaaS, the cloud provider maintains total control over the
physical hardware and administrative control over the
hypervisor layer
• Subscriber controls the Guest OS, Middleware and
Applications layers.
• Subscriber is free (using the provider's utilities) to load any
supported operating system software desired into the VM.
• Subscriber typically maintains complete control over the
operation of the guest operating system in each VM.
5
IaaS Component Stack and Scope of Control
6
IaaS Cloud Architecture
• Logical view of IaaS cloud structure and operation
Source: Lee Badger and Tim Grance, "NIST DRAFT Cloud Computing Synopsis and Recommendations"
7
IaaS Cloud Architecture
• Three-level hierarchy of components in IaaS cloud systems
– Top level is responsible for central control
– Middle level is responsible for management of possibly large
computer clusters that may be geographically distant from one
another
– Bottom level is responsible for running the host computer systems on
which virtual machines are created.
• Subscriber queries and commands generally flow into the system
at the top and are forwarded down through the layers that either
answer the queries or execute the commands
8
IaaS Cloud Architecture
• Cluster Managers can be geographically
distributed
• Within a cluster, the Computer Managers are
connected via a high-speed network.
9
Operation of the Cloud Manager
• Cloud Manager is the public access point to the cloud where
subscribers sign up for accounts, manage the resources they rent
from the cloud, and access data stored in the cloud.
• Cloud Manager has mechanism for:
– Authenticating subscribers
– Generating or validating access credentials that subscriber uses when
communicating with VMs.
– Top-level resource management.
• For a subscriber’s request, the Cloud Manager determines whether the cloud
has enough free resources to satisfy the request
10
Data Object Storage (DOS)
11
Operation of the Cluster Managers
• Each Cluster Manager is responsible for the operation of a collection of
computers that are connected via high speed local area networks
• Cluster Manager receives resource allocation commands and queries
from the Cloud Manager, and calculates whether part or all of a
command can be satisfied using the resources of the computers in the
cluster.
• Cluster Manager queries the Computer Managers for the computers in
the cluster to determine resource availability, and returns messages to
the Cloud Manager
12
Operation of the Cluster Managers
• Directed by the Cloud Manager, a Cluster Manager then
instructs the Computer Managers to perform resource
allocation, and reconfigures the virtual network infrastructure
to give the subscriber uniform access.
• Each Cluster Manager is connected to Persistent Local Storage
(PLS)
• PLS provides persistent disk-like storage to Virtual Machines
13
Operation of the Computer Managers
• At the lowest level in the hierarchy, a Computer Manager runs on each
computer system and uses the concept of virtualization to provide Virtual
Machines to subscribers
• The Computer Manager maintains status information, including how many
virtual machines are running and how many can still be started
• Computer Manager uses the command interface of its hypervisor to start,
stop, suspend, and reconfigure virtual machines
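As a minimal, compile-only sketch (the interface and method names are hypothetical, not any real hypervisor API), the Computer Manager's use of the hypervisor's command interface could look like this:

// Hypothetical command interface of the kind a Computer Manager might call on its hypervisor.
interface HypervisorControl {
    String createVm(String imageId, int vcpus, int memoryMb); // returns a VM identifier
    void start(String vmId);
    void suspend(String vmId);
    void stop(String vmId);
    int runningVmCount();                                     // status information kept per host
}

class ComputerManager {
    private final HypervisorControl hypervisor;
    private final int capacity; // how many VMs this host can run at once

    ComputerManager(HypervisorControl hypervisor, int capacity) {
        this.hypervisor = hypervisor;
        this.capacity = capacity;
    }

    // Allocate and start a VM if capacity allows; otherwise report failure to the Cluster Manager.
    String allocate(String imageId, int vcpus, int memoryMb) {
        if (hypervisor.runningVmCount() >= capacity) {
            return null; // no room left on this host
        }
        String vmId = hypervisor.createVm(imageId, vcpus, memoryMb);
        hypervisor.start(vmId);
        return vmId;
    }
}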
14
Virtualization
• Virtualization is a broad term (virtual memory, storage, network, etc)
• Focus: Platform virtualization
• Virtualization basically allows one computer to do the job of multiple computers, by sharing the
resources of a single hardware across multiple environments
[Figure: side-by-side comparison. In the 'non-virtualized' system, applications A-D run on a single OS that controls all hardware resources. In the virtualized system, a virtualization layer on top of the hardware makes it possible to run multiple Virtual Containers (each with its own applications) on a single physical platform.]
Source: www.dc.uba.ar/events/eci/2008/courses/n2/Virtualization-Introduction.ppt
15
Virtualization
• Virtualization is a way to run multiple operating systems and user applications on the
same hardware
– E.g., run both Windows and Linux on the same laptop
• How is it different from dual-boot?
– Both OSes run simultaneously
• The OSes are completely isolated from each other
16
Hypervisor or Virtual Machine Monitor
Research Paper: Popek and Goldberg, "Formal requirements for virtualizable third generation architectures", CACM 1974
(http://portal.acm.org/citation.cfm?doid=361011.361073).
A hypervisor or virtual machine monitor runs the guest OS directly on the CPU. (This only works if
the guest OS uses the same instruction set as the host OS.) Since the guest OS is running in user
mode, privileged instructions must be intercepted or replaced. This further imposes restrictions on
the instruction set of the CPU, as observed in a now-famous paper by Popek and Goldberg, which
identifies three goals for a virtual machine architecture:
• Equivalence: The VM should be indistinguishable from the underlying hardware.
• Resource control: The VM should be in complete control of any virtualized resources.
• Efficiency: Most VM instructions should be executed directly on the underlying CPU without
involving the hypervisor.
Source: www.dc.uba.ar/events/eci/2008/courses/n2/Virtualization-Introduction.ppt
17
Hypervisor or Virtual Machine Monitor
Popek and Goldberg describe (and give a formal proof of) the requirements for the CPU's
instruction set to allow these properties. The main idea here is to classify instructions into
• privileged instructions, which cause a trap if executed in user mode, and
• sensitive instructions, which change the underlying resources (e.g. doing I/O or changing the
page tables) or observe information that indicates the current privilege level (thus exposing
the fact that the guest OS is not running on the bare hardware).
• The former class of sensitive instructions is called control sensitive and the latter behavior
sensitive in the paper, but the distinction is not particularly important.
What Popek and Goldberg show is that we can only run a virtual machine with all three desired
properties if the sensitive instructions are a subset of the privileged instructions. If this is the case,
then we can run most instructions directly, and any sensitive instructions trap to the hypervisor
which can then emulate them (hopefully without much slowdown).
Source: www.dc.uba.ar/events/eci/2008/courses/n2/Virtualization-Introduction.ppt
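A toy reading of the theorem (illustrative only; the instruction names below are made up, not a real ISA): a machine is virtualizable in this sense exactly when its sensitive instructions are a subset of its privileged instructions.

import java.util.Set;

// Toy check of the Popek-Goldberg condition: virtualizable if sensitive instructions are a subset of privileged ones.
public class PopekGoldbergCheck {
    public static boolean isVirtualizable(Set<String> privileged, Set<String> sensitive) {
        return privileged.containsAll(sensitive); // every sensitive instruction must trap in user mode
    }

    public static void main(String[] args) {
        Set<String> privileged   = Set.of("LOAD_PAGE_TABLE", "SET_PRIV_LEVEL", "IO_OUT");
        Set<String> sensitiveOk  = Set.of("LOAD_PAGE_TABLE", "IO_OUT");
        Set<String> sensitiveBad = Set.of("LOAD_PAGE_TABLE", "READ_PRIV_LEVEL"); // does not trap

        System.out.println(isVirtualizable(privileged, sensitiveOk));  // true  -> trap-and-emulate works
        System.out.println(isVirtualizable(privileged, sensitiveBad)); // false -> needs binary rewriting or hardware assists
    }
}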
18
VMM and VM
[Figure: mapping between the three VMM goals (Equivalence, Resource Control, Efficiency) and the instruction classes (privileged instructions, control-sensitive instructions, behavior-sensitive instructions).]
• For any conventional third generation computer, a VMM may be constructed if the set of sensitive
instructions for that computer is a subset of the set of privileged instructions
• A conventional third generation computer is recursively virtualizable if it is virtualizable and a VMM without
any timing dependencies can be constructed for it.
Source: www.dc.uba.ar/events/eci/2008/courses/n2/Virtualization-Introduction.ppt
19
Approaches to Server Virtualization
20
Evolution of Software Solutions
• 1st Generation: Full virtualization (binary rewriting)
– Software based
– VMware and Microsoft
• 2nd Generation: Para-virtualization
– Cooperative virtualization
– Modified guest
– VMware, Xen
• 3rd Generation: Silicon-based (hardware-assisted) virtualization
– Unmodified guest
– VMware and Xen on virtualization-aware hardware platforms
[Figure: timeline showing virtual machines over dynamic translation on a host operating system, then virtual machines over a hypervisor, then virtual machines over a hypervisor with hardware virtualization logic.]
Source: www.dc.uba.ar/events/eci/2008/courses/n2/Virtualization-Introduction.ppt
21
Full Virtualization
• 1st Generation offering of x86/x64 server virtualization
• Dynamic binary translation
– Emulation layer talks to an operating system which talks to the computer hardware
– Guest OS doesn't see that it is used in an emulated environment
• All of the hardware is emulated, including the CPU
[Figure: a virtual machine (Apps A-C, Guest OS, device drivers) running on emulated hardware, which in turn runs on the host OS and its device drivers.]
22
Full Virtualization - Advantages
• Emulation layer
– Isolates VMs from the host OS and from each other
– Controls individual VM access to system resources, preventing an unstable VM
from impacting system performance
• Total VM portability
– By emulating a consistent set of system hardware, VMs have the ability to
transparently move between hosts with dissimilar hardware without any
problems
• It is possible to run an operating system that was developed for another
architecture on your own architecture
• A VM running on a Dell server can be relocated to a Hewlett-Packard server
Source: www.dc.uba.ar/events/eci/2008/courses/n2/Virtualization-Introduction.ppt
23
Full Virtualization - Drawbacks
• Hardware emulation comes with a performance price
• In traditional x86 architectures, OS kernels expect to run privileged code in
Ring 0
– However, because Ring 0 is controlled by the host OS, VMs are forced to execute
at Ring 1/3, which requires the VMM to trap and emulate instructions
• Due to these performance limitations, para-virtualization and hardware-
assisted virtualization were developed
[Figure: ring model comparison. Traditional x86 architecture: applications in Ring 3, operating system in Ring 0. Full virtualization: applications in Ring 3, guest OS in Ring 1/3, Virtual Machine Monitor in Ring 0.]
Source: www.dc.uba.ar/events/eci/2008/courses/n2/Virtualization-Introduction.ppt
24
Server virtualization approaches
Para-Virtualization
• Guest OS is modified and thus runs kernel-level operations at Ring 1 (or 3)
– Guest is fully aware of how to process privileged instructions
– Privileged instruction translation by the VMM is no longer necessary
– Guest operating system uses a specialized API to talk to the VMM and, in this way, execute the privileged instructions
• VMM is responsible for handling the virtualization requests and putting them to the hardware
[Figure: a virtual machine (Apps A-C, modified Guest OS, device drivers) talking to the Virtual Machine Monitor through a specialized API; the hypervisor's device drivers sit directly on the hardware.]
Source: www.dc.uba.ar/events/eci/2008/courses/n2/Virtualization-Introduction.ppt
25
Para-Virtualization
• Today, VM guest operating systems are para-virtualized using two different approaches:
– Recompiling the OS kernel
• Para-virtualization drivers and APIs must reside in the guest operating system kernel
• You do need a modified operating system that includes this specific API, which requires the operating
system to be compiled to be virtualization aware
– Some vendors (such as Novell) have embraced para-virtualization and have provided para-virtualized
OS builds, while other vendors (such as Microsoft) have not
– Installing para-virtualized drivers
• In some operating systems it is not possible to use complete para-virtualization, as it requires a specialized
version of the operating system
• To ensure good performance in such environments, para-virtualization can be applied for individual devices
• For example, the instructions generated by network boards or graphical interface cards can be modified
before they leave the virtualized machine by using para-virtualized drivers
Source: www.dc.uba.ar/events/eci/2008/courses/n2/Virtualization-Introduction.ppt
26
Server virtualization approaches
Hardware-assisted virtualization
• Guest OS runs at Ring 0
• VMM uses processor extensions (such as Intel®-VT or AMD-V) to intercept and emulate privileged operations in the guest
[Figure: a virtual machine (Apps A-C, unmodified Guest OS, device drivers) running on a hypervisor over virtualization-aware hardware.]
27
Server virtualization approaches
Hardware-assisted virtualization
• Pros
– It allows running unmodified OSs (so legacy OSs can be run without
problems)
• Cons
– Speed and Flexibility
• An unmodified OS does not know it is running in a virtualized
environment and so it can’t take advantage of any of the
virtualization features
– This can be partially resolved using para-virtualization
Source: www.dc.uba.ar/events/eci/2008/courses/n2/Virtualization-Introduction.ppt
28
Network Virtualization
Making a physical network appear as multiple logical ones
29
Why Virtualize ?
• Internet is almost “paralyzed”
– Lots of makeshift solutions (e.g. overlays)
– A new architecture (aka clean-slate) is needed
30
Related Concepts
• Virtual Private Networks (VPN)
– Virtual network connecting distributed sites
– Not customizable enough
• Overlay Networks
– Application layer virtual networks
– Not flexible enough
31
Network Virtualization Model
• Business Model
• Architecture
• Design Principles
• Design Goals
32
Business Model
Players and Relationships
• Infrastructure Providers (InPs)
– Manage underlying physical networks
33
Architecture
34
Design Principles
• Concurrence of multiple heterogeneous virtual networks
– Introduces diversity
• Recursion of virtual networks
– Opens the door for network virtualization economics
• Hierarchy of roles
[Figure: recursive stacking of roles - Service Provider 1 runs Virtual Network 1 on resources from Infrastructure Provider 2; Service Provider N runs Virtual Network N on resources from Infrastructure Provider N+1.]
35
Design Goals (1)
• Flexibility
– Service providers can choose
• arbitrary network topology,
• routing and forwarding functionalities,
• customized control and data planes
– No need for co-ordination with others
• IPv6 fiasco should never happen again
• Manageability
– Clear separation of policy from mechanism
– Defined accountability of infrastructure and service providers
– Modular management
36
Design Goals (2)
• Scalability
– Maximize the number of co-existing virtual networks
– Increase resource utilization and amortize CAPEX and OPEX
37
Design Goals (3)
• Programmability
– Of network elements e.g. routers
– Answer “How much” and “how”
– Easy and effective without being vulnerable to threats
• Heterogeneity
– Networking technologies
• Optical, sensor, wireless etc.
– Virtual networks
38
Design Goals (4)
• Experimental and Deployment Facility
– PlanetLab, GENI, VINI
– Directly deploy services in real world from the testing phase
• Legacy Support
– Consider the existing Internet as a member of the collection of
multiple virtual Internets
– Very important to keep all concerned parties satisfied
39
Definition
Network virtualization is a networking environment that allows
multiple service providers to dynamically compose multiple
heterogeneous virtual networks that co-exist together in
isolation from each other, and to deploy customized end-to-
end services on-the-fly as well as manage them on those
virtual networks for the end-users by effectively sharing and
utilizing underlying network resources leased from multiple
infrastructure providers.
40
Typical Approach
• Networking technology
– IP, ATM
• Layer of virtualization
• Architectural domain
– Network resource management, Spawning networks
• Level of virtualization
– Node virtualization, Full virtualization
41
42
Introduction to XML:
eXtensible Markup Language
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
1
XML ??
• Over time, the acronym “XML” has evolved to imply a growing family of
software tools/XML standards/ideas around
– How XML data can be represented and processed
– application frameworks (tools, dialects) based on XML
2
Presentation Outline
• What is XML (basic introduction)
– Language rules, basic XML processing
• Defining language dialects
– DTDs, schemas, and namespaces
• XML processing
– Parsers and parser interfaces
– XML-based processing tools
• XML messaging
– Why, and some issues/example
• Conclusions
3
What is XML?
• A syntax for “encoding” text-based data (words, phrases, numbers, ...)
• A text-based syntax. XML is written using printable Unicode characters (no explicit
binary data; character encoding issues)
• Extensible. XML lets you define your own elements (essentially data types), within
the constraints of the syntax rules
• Universal format. The syntax rules ensure that all XML processing software MUST
identically handle a given piece of XML data.
4
What is XML: A Simple Example
The first line is the XML declaration ("this is XML"), which also names the character encoding used in the file:
<?xml version="1.0" encoding="iso-8859-1"?>
<partorders
  xmlns="http://myco.org/Spec/partorders">
  <order ref="x23-2112-2342"
         date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets,
           with matching hamster
    </desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <deliveryDate date="27aug1999-12:00h" />
  </order>
  <order ref="x23-2112-2342"
         date="25aug1999-12:34:23h">
    . . . Order something else . . .
  </order>
</partorders>
5
Example Revisited
The same document, annotated: <partorders>, <order>, <desc>, etc. are element tags; units is an attribute of the quantity element; the whole document is hierarchical, structured information:
<partorders
  xmlns="http://myco.org/Spec/partorders">
  <order ref="x23-2112-2342"
         date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets,
           with matching hamster
    </desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <deliveryDate date="27aug1999-12:00h" />
  </order>
  <order ref="x23-2112-2342"
         date="25aug1999-12:34:23h">
    . . . Order something else . . .
  </order>
</partorders>
6
XML Data Model - A Tree
[Figure: the document as a tree. The partorders root (with its xmlns attribute) has order children, each carrying ref and date attributes; every order contains desc (with text content), part, quantity (with text content), and delivery-date elements.]
7
XML: Why it's this way
• Simple (like HTML -- but not quite so simple)
– Strict syntax rules, to eliminate syntax errors
– syntax defines structure (hierarchically), and names structural parts (element names) -- it is self-describing
data
8
XML Processing
<?xml version="1.0" encoding="utf-8" ?>
<transfers>
<fundsTransfer date="20010923T12:34:34Z">
<from type="intrabank">
<amount currency="USD"> 1332.32 </amount>
<transitID> 3211 </transitID>
<accountID> 4321332 </accountID>
<acknowledgeReceipt> yes </acknowledgeReceipt>
</from>
<to account="132212412321" />
</fundsTransfer>
<fundsTransfer date="20010923T12:35:12Z">
<from type="internal">
<amount currency="CDN" >1432.12 </amount>
<accountID> 543211 </accountID>
<acknowledgeReceipt> yes </acknowledgeReceipt>
</from>
<to account="65123222" />
</fundsTransfer> xml-simple.xml
</transfers>
9
XML Parser Processing Model
[Figure: XML data flows into the parser, which exposes a parser interface to the XML-based application.]
The parser must verify that the XML data is syntactically correct.
Such data is said to be well-formed
– The minimal requirement to “be” XML
A parser MUST stop processing if the data isn’t well-formed
– E.g., stop processing and “throw an exception” to the XML-based application.
The XML 1.0 spec requires this behaviour
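A small sketch of this behaviour with the standard Java parser API (javax.xml.parsers); the file name is only an example:

import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;
import java.io.File;

// Parse a document; non-well-formed input makes the parser stop and throw an exception.
public class WellFormedCheck {
    public static void main(String[] args) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new File("partorders.xml")); // example file name
            System.out.println("Document is well-formed");
        } catch (SAXParseException e) {
            // The XML 1.0 spec requires the parser to report a fatal error here.
            System.out.println("Not well-formed at line " + e.getLineNumber() + ": " + e.getMessage());
        } catch (Exception e) {
            System.out.println("I/O or configuration problem: " + e.getMessage());
        }
    }
}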
10
XML Processing Rules: Including Parts
The Document Type Declaration (DTD) can declare an internal entity, which is then pulled into the document with an entity reference (&name;):
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE transfers [
<!-- Here is an internal entity that encodes a bunch of
     markup that we'd otherwise use in a document -->
<!ENTITY messageHeader
  "<header>
     <routeID> info generic to message route </routeID>
     <encoding>how message is encoded </encoding>
   </header> "
>
]>
<transfers>
  &messageHeader;
  <fundsTransfer date="20010923T12:34:34Z">
    <from type="intrabank">
    . . . Content omitted . . .
</transfers>
(xml-simple-intEntity.xml)
11
XML Parser Processing Model
[Figure: as before, XML data flows through the parser to the XML-based application; the parser now also reads the DTD.]
12
XML Parsers, DTDs, and Internal Entities
The parser processes the DTD content, identifies the internal entities, and checks
that each entity is well-formed.
There are explicit syntax rules for DTD content -- well-formed XML must be correct
here also.
The “resolved” data object is then made available to the XML application
13
XML Processing Rules: External Entities
Put the entity in another file -- so it can be shared by multiple resources. The external entity declaration gives the entity's location via a URL:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE transfers [
. . .
<!ENTITY messageHeader
  SYSTEM "http://www.somewhere.org/dir/head.xml"
>
]>
<transfers>
  &messageHeader;
  <fundsTransfer date="20010923T12:34:34Z">
    <from type="intrabank">
    . . . Content omitted . . .
</transfers>
(xml-simple-extEntity.xml)
14
XML Parsers and External Entities
The parser processes the DTD content, identifies the external entities, and “tries” to
resolve them
But …. what if the parser can’t find the external entity (firewall?)?
15
Two types of XML parsers
Validating parser
– Must retrieve all entities and must process all DTD content. Will stop processing and
indicate a failure if it cannot
– There is also the implication that it will test for compatibility with other things in the DTD --
instructions that define syntactic rules for the document (allowed elements, attributes,
etc.). We’ll talk about these parts in the next section.
Non-validating parser
– Will try to retrieve all entities defined in the DTD, but will cease processing the DTD
content at the first entity it can’t find, But this is not an error -- the parser simply makes
available the XML data (and the names of any unresolved entities) to the
application.
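With the standard JAXP factories the same parser can usually be switched between the two modes; a minimal sketch (the file name follows the slides):

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;
import java.io.File;

// Toggle validation on a JAXP SAX parser; a validating parser reports DTD violations as errors.
public class ValidationModes {
    public static void main(String[] args) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);                 // false -> non-validating (well-formedness only)
        SAXParser parser = factory.newSAXParser();

        parser.parse(new File("xml-simple-valid.xml"), new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXParseException {
                // Validity errors (e.g. an element not allowed by the DTD) arrive here.
                throw e;
            }
        });
        System.out.println("Parsed (and validated) successfully");
    }
}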
16
XML Parser Processing Model
[Figure: the same parser/application picture; the relationship to the DTD depends on the parser's nature.]
Many parsers can operate in either validating or non-validating mode (parameter-dependent)
17
Special Issues: Characters and Charsets
• The XML specification defines which characters can be used as whitespace in tags: <element id = "23.112" />
• What if you want to include characters not defined in the encoding charset (e.g., Greek characters
in an ISO-Latin-1 document)? Use a character reference, e.g. &#x3B1; for the Greek letter alpha
18
Presentation Outline
• What is XML (basic introduction)
– Language rules, basic XML processing
• Defining language dialects
– DTDs, schemas, and namespaces
• XML processing
– Parsers and parser interfaces
– XML-based processing tools
• XML messaging
– Why, and some issues/example
• Conclusions
19
How do you define language dialects?
• Two ways of doing so:
– XML Document Type Declaration (DTD) -- Part of core XML spec.
– XML Schema -- New XML specification (2001), which allows for stronger constraints on XML documents.
• Schemas are more powerful than DTDs. They are often used for type validation, or for relating
database schemas to XML models
20
Example DTD (as part of document)
<!DOCTYPE transfers [
<!ELEMENT transfers (fundsTransfer)+ >
<!ELEMENT fundsTransfer (from, to) > xml-simple-valid.xml
<!ATTLIST fundsTransfer
date CDATA #REQUIRED>
<!ELEMENT from (amount, transitID?, accountID,
acknowledgeReceipt ) >
<!ATTLIST from
type (intrabank|internal|other) #REQUIRED>
<!ELEMENT amount (#PCDATA) >
. . . Omitted DTD content . . .
<!ELEMENT to EMPTY >
<!ATTLIST to
account CDATA #REQUIRED>
]>
<transfers>
<fundsTransfer date="20010923T12:34:34Z">
. . . As with previous example . . .
21
Example “External” DTD
Reference is using a variation on the DOCTYPE:
simple.dtd
<!DOCTYPE transfers SYSTEM
"http://www.foo.org/hereitis/simple.dtd” >
<transfers>
<fundsTransfer date="20010923T12:34:34Z">
. . . As with previous example . . .
. . .
</transfers>
22
23
Introduction to XML:
eXtensible Markup Language
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
1
XML Schemas
• A new specification (2001) for specifying validation rules for XML
Specs: http://www.w3.org/XML/Schema
Best-practice: http://www.xfront.com/BestPracticesHomepage.html
2
XML Schema version of our DTD (Portion)
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:element name="accountID" type="xs:string"/>
<xs:element name="acknowledgeReceipt" type="xs:string"/>
<xs:complexType name="amountType">
<xs:simpleContent>
<xs:restriction base="xs:string">
<xs:attribute name="currency" use="required">
<xs:simpleType>
<xs:restriction base="xs:NMTOKEN">
<xs:enumeration value="USD"/>
. . . (some stuff omitted) . . .
</xs:restriction>
</xs:simpleType>
</xs:attribute>
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
<xs:complexType name="fromType">
<xs:sequence>
simple.xsd
<xs:element name="amount" type="amountType"/>
<xs:element ref="transitID" minOccurs="0"/>
<xs:element ref="accountID"/>
<xs:element ref="acknowledgeReceipt"/>
</xs:sequence>
. . .
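A short sketch of validating an instance document against this schema with the standard javax.xml.validation API (file names follow the slide's examples):

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import java.io.File;

// Validate an instance document against simple.xsd using the W3C XML Schema language.
public class SchemaValidate {
    public static void main(String[] args) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new File("simple.xsd"));
        Validator validator = schema.newValidator();
        validator.validate(new StreamSource(new File("xml-simple.xml")));
        System.out.println("Valid against simple.xsd"); // an exception is thrown otherwise
    }
}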
3
XML Namespaces
• Mechanism for identifying different “spaces” for XML names
– That is, element or attribute names
• This is a way of identifying different language dialects, consisting of names that have specific
semantic (and processing) meanings.
• Thus <key/> in one language (might mean a security key) can be distinguished from <key/> in
another language (a database key)
• Mechanism uses a special xmlns attribute to define the namespace. The namespace is given as
a URL string
– But the URL does not reference anything in particular (there may be nothing there)
4
Mixing language dialects together
Namespaces let you do this relatively easily. The default 'space' (xmlns) is XHTML; the mt: prefix indicates the MathML 'space' (a different language):
<?xml version="1.0" encoding="utf-8" ?>
<html xmlns="http://www.w3.org/1999/xhtml1"
      xmlns:mt="http://www.w3.org/1998/mathml">
<head>
<title> Title of XHTML Document </title>
</head><body>
<div class="myDiv">
<h1> Heading of Page </h1>
<mt:mathml>
<mt:title> ... MathML markup . . .
</mt:mathml>
<p> more html stuff goes here </p>
</div>
</body>
</html>
5
Presentation Outline
What is XML (basic introduction)
– Language rules, basic XML processing
Defining language dialects
– DTDs, schemas, and namespaces
XML processing
– Parsers and parser interfaces
– XML-based processing tools
XML messaging
– Why, and some issues/example
Conclusions
6
XML Software
• XML parser -- Reads in XML data, checks for syntactic (and possibly DTD/Schema) constraints,
and makes data available to an application. There are three 'generic' parser APIs
– SAX Simple API to XML (event-based)
– DOM Document Object Model (object/tree based)
– JDOM Java Document Object Model (object/tree based)
• Lots of XML parsers and interface software available (Unix, Windows, OS/390 or Z/OS, etc.)
• SAX-based parsers are fast (often as fast as you can stream data)
• DOM slower, more memory intensive (create in-memory version of entire document)
7
XML Processing: SAX
A) SAX: Simple API for XML
– http://www.megginson.com/SAX/index.html
– An event-based interface
– Parser reports events whenever it sees a tag/attribute/text node/unresolved external
entity/other
– Programmer attaches “event handlers” to handle the event
• Advantages
– Simple to use
– Very fast (not doing very much before you get the tags and data)
– Low memory footprint (doesn’t read an XML document entirely into memory)
• Disadvantages
– Not doing very much for you -- you have to do everything yourself
– Not useful if you have to dynamically modify the document once it’s in memory (since you’ll
have to do all the work to put it in memory yourself!)
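A minimal SAX sketch in Java (JAXP) that prints the elements of the funds-transfer example; the file name is the one used in the slides:

import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;
import java.io.File;

// Event-based parsing: the handler is called back as tags, attributes and text are seen.
public class SaxDemo {
    public static void main(String[] args) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                System.out.println("start element: " + qName
                        + (attrs.getLength() > 0 ? " (" + attrs.getQName(0) + "=" + attrs.getValue(0) + ")" : ""));
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) System.out.println("  text: " + text);
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(new File("xml-simple.xml"), handler);
    }
}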
8
XML Processing: DOM
B) DOM: Document Object Model
– http://www.w3.org/DOM/
– An object-based interface
– Parser generates an in-memory tree corresponding to the document
– DOM interface defines methods for accessing and modifying the tree
• Advantages
– Very useful for dynamic modification of, access to the tree
– Useful for querying (i.e. looking for data) that depends on the tree structure
[pseudo-code: element.childNode("2").getAttributeValue("date")]
– Same interface for many programming languages (C++, Java, ...)
• Disadvantages
– Can be slow (needs to produce the tree), and may need lots of memory
– DOM programming interface is a bit awkward, not terribly object oriented
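A corresponding DOM sketch: the whole document is loaded into an in-memory tree and then queried (file name as in the slides):

import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.File;

// Tree-based parsing: the document is fully materialised in memory before it is queried.
public class DomDemo {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File("xml-simple.xml"));
        NodeList transfers = doc.getElementsByTagName("fundsTransfer");
        for (int i = 0; i < transfers.getLength(); i++) {
            Element transfer = (Element) transfers.item(i);
            System.out.println("fundsTransfer date=" + transfer.getAttribute("date"));
        }
        // The tree can also be modified in place, e.g. transfer.setAttribute("checked", "yes");
    }
}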
9
DOM Parser Processing Model
[Figure: the DOM parser reads the XML data and hands the application a Document "object" - an in-memory tree with partorders at the root and order, desc, part, quantity, delivery-date nodes below.]
10
XML Processing: JDOM
C) JDOM: Java Document Object Model
– http://www.jdom.org
– A Java-specific object-oriented interface
– Parser generates an in-memory tree corresponding to the document
– JDOM interface has methods for accessing and modifying the tree
• Advantages
– Very useful for dynamic modification of the tree
– Useful for querying (i.e. looking for data) that depends on the tree structure
– Much nicer Object Oriented programming interface than DOM
• Disadvantages
– Can be slow (make that tree...), and can take up lots of memory
– New, and not entirely cooked (but close)
– Only works with Java, and not (yet) part of Core Java standard
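A small JDOM sketch for comparison, assuming the JDOM2 library (org.jdom2) is on the classpath; file name as in the slides:

import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.input.SAXBuilder;
import java.io.File;

// JDOM2: build the tree with SAXBuilder, then navigate it with plain Java collections.
public class JdomDemo {
    public static void main(String[] args) throws Exception {
        Document doc = new SAXBuilder().build(new File("xml-simple.xml"));
        Element root = doc.getRootElement();
        for (Element transfer : root.getChildren("fundsTransfer")) {
            System.out.println("date = " + transfer.getAttributeValue("date"));
        }
    }
}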
11
XML Processing: dom4j
C) dom4j: XML framework for Java
– http://www.dom4j.org
– Java framework for reading, writing, navigating and editing XML.
– Provides access to SAX, DOM, JDOM interfaces, and other XML utilities (XSLT, JAXP, …)
– Can do “mixed” SAX/DOM parsing -- use SAX to one point in a document, then turn rest into a
DOM tree.
• Advantages
– Lots of goodies, all rolled into one easy-to-use Java package
– Can do “mixed” SAX/DOM parsing -- use SAX to one point in a document, then turn rest into a
DOM tree
– Apache open source license means free use (and IBM likes it!)
• Disadvantages
– Java only; may be concerns over open source nature (but IBM uses it, so it can’t be that bad!)
12
Some XML Parsers (OS/390’s)
• Xerces (C++; Apache Open Source)
http://xml.apache.org/xerces-c/index.html
• XML toolkit (Java and C++; Commercial license)
http://www-1.ibm.com/servers/eserver/zseries/software/xml/
I believe the Java version uses XML4j, IBM’s Java Parser. The
latest version is always found at:
http://www.alphaworks.ibm.com
• XML for C++ (IBM; based on Xerces; Commercial license)
http://www.alphaworks.ibm.com/tech/xml4c
• XMLBooster (parsers for COBOL, C++, …; Commercial license; don’t know much about it; OS/390 support unknown)
http://www.xmlbooster.com/
Has a free trial download: can see if it is any good ;-)
• XML4Cobol (don’t know much about it, any COBOL85 is fine)
http://www.xml4cobol.com
13
Some parser benchmarks:
• http://www-106.ibm.com/developerworks/xml/library/x-injava/index.html (Sept 2001)
• http://www.devsphere.com/xml/benchmark/index.html (Java) (late-2000)
• Basically
14
XML Processing: XSLT
D) XSLT eXtensible Stylesheet Language -- Transformations
– http://www.w3.org/TR/xslt
– An XML language for processing XML
– Does tree transformations -- takes XML and an XSLT style sheet as input, and produces a new
XML document with a different structure
• Advantages
– Very useful for tree transformations -- much easier than DOM or SAX for this purpose
– Can be used to query a document (XSLT pulls out the part you want)
• Disadvantages
– Can be slow for large documents or stylesheets
– Can be difficult to debug stylesheets (poor error detection; much better if you use schemas)
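A minimal JAXP/XSLT sketch; the style sheet and file names are examples only:

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.File;

// Apply an XSLT style sheet to an input document and write the transformed tree to a new file.
public class XsltDemo {
    public static void main(String[] args) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new File("style.xsl"))); // example style sheet
        transformer.transform(new StreamSource(new File("xml-simple.xml")),
                              new StreamResult(new File("out.xml")));
        System.out.println("Wrote out.xml");
    }
}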
15
XSLT Processing Model
[Figure: the XSLT style sheet (itself XML) and the XML data are each parsed; the XSLT processor walks the resulting document "objects" (and any schema) and emits transformed XML data with a different tree structure.]
16
Presentation Outline
• What is XML (basic introduction)
– Language rules, basic XML processing
• Defining language dialects
– DTDs, schemas, and namespaces
• XML processing
– Parsers and parser interfaces
– XML-based processing tools
• XML messaging
– Why, and some issues/example
17
XML Messaging
• Use XML as the format for sending messages between systems
• Advantages are:
– Common syntax; self-describing (easier to parse)
– Can use common/existing transport mechanisms to “move” the XML data (HTTP, HTTPS, SMTP (email),
MQ, IIOP/(CORBA), JMS, ….)
• Requirements
– Shared understanding of dialects for transport (required registry [namespace!] ) for identifying dialects
– Shared acceptance of messaging contract
• Disadvantages
– Asynchronous transport; no guarantee of delivery, no guarantee that partner (external) shares
acceptance of contract.
– Messages will be much larger than binary (10x or more) [can compress]
18
Common messaging model
• XML over HTTP
– Use HTTP to transport XML messages:
POST /path/to/interface.pl HTTP/1.1
Referer: http://www.foo.org/myClient.html
User-agent: db-server-olk
Accept-encoding: gzip
Accept-charset: iso-8859-1, utf-8, ucs
Content-type: application/xml; charset=utf-8
Content-length: 13221
. . .
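A small sketch of posting an XML message over HTTP with the JDK 11+ java.net.http client; the endpoint URL and payload are placeholders:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Send an XML message as the body of an HTTP POST and read the (XML) reply.
public class XmlOverHttp {
    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?><transfers>...</transfers>"; // placeholder payload
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://www.example.org/path/to/interface")) // placeholder endpoint
                .header("Content-Type", "application/xml; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(xml))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}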
19
Some standards for message format
• Define dialects designed to “wrap” remote invocation messages
• XML-RPC http://www.xmlrpc.com
– Very simple way of encoding function/method call name, and passed
parameters, in an XML message.
20
XML Messaging + Processing
• XML as a universal format for data exchange
[Figure: a factory application uses a SOAP API to place an order as an XML/EDI message, carried by SOAP over a transport such as HTTP(S) or SMTP; suppliers expose a SOAP interface and return XML/EDI responses the same way.]
21
Presentation Outline
• What is XML (basic introduction)
– Language rules, basic XML processing
• Defining language dialects
– DTDs, schemas, and namespaces
• XML processing
– Parsers and parser interfaces
– XML-based processing tools
• XML messaging
– Why, and some issues/example
• Conclusions
22
[Figure: the XML family of W3C recommendations and industry standards around the XML core and Infoset - RDF, Canonical XML, XPath, XPointer, XLink, XML Base, XSL, XSLT, XML Signature, XML Query, MathML, SVG, SMIL 1 & 2, XHTML 1.0, XHTML events, and the DOM 1/2/3, JDOM, JAXP APIs.]
2
History!
• Structured programming
• Object-oriented programming
• Distributed computing
• Electronic Data Interchange (EDI)
• World Wide Web
• Web Services
3
Distributed Computing
• When developers create substantial applications, it is often more efficient, or even necessary, for
different tasks to be performed on different computers; such programs are called N-tier applications:
• A 3-tier application might have a user interface on one computer, business-logic processing
on a second and a database on a third – all interacting as the application runs.
• For distributed applications to function correctly, application components, e.g. programming
objects, executing on different computers throughout a network must be able to communicate.
E.g.: DCE, CORBA, DCOM, RMI etc.
• Interoperability:
• Ability to communicate and share data with software from different vendors and platforms
• Limited among conventional proprietary distributed computing technologies
4
Electronic Data Interchange (EDI)
• Computer-to-computer exchange of business data and documents between
companies using standard formats recognized both nationally and internationally.
• The information used in EDI is organized according to a specified format set by
both companies participating in the data exchange.
• Advantages:
• Lower operating costs
• Saves time and money
• Fewer Errors => More Accuracy
• No data entry, so less human error
• Increased Productivity
• More efficient personnel and faster throughput
• Faster trading cycle
• Streamlined processes for improved trading relationships
5
Web Services
• Take advantage of OOP by enabling developers to build applications from
existing software components in a modular approach:
• Transform a network (e.g. the Internet) into one library of programmatic
components available to developers to have significant productivity gains.
• Improve distributed computing interoperability by using open (non-
proprietary) standards that can enable (theoretically) any two software
components to communicate:
• Also they are easier to debug because they are text-based, rather than binary,
communication protocols
6
Web Services (contd…)
• Provide capabilities similar to those of EDI (Electronic Data Interchange), but are
simpler and less expensive to implement.
• Configured to work with EDI systems, allowing organisations to use the two
technologies together or to phase out EDI while adopting Web services.
• Unlike WWW
• Separates visual from non-visual components
• Interactions may be either through the browser or through a desktop client
(Java Swing, Python, Windows, etc.)
7
Web Services (contd…)
Intended to solve three problems:
Interoperability:
Lack of interoperability standards in distributed object messaging
DCOM apps strictly bound to Windows Operating system
RMI bound to Java programming language
Firewall traversal:
CORBA and DCOM used non-standard ports
Web Services use HTTP; most firewalls allow access through port 80 (HTTP), leading to easier and
dynamic collaboration
Complexity:
Web Services: developer-friendly service system
Use open, text-based standards, which allow components written in different languages and for
different platforms to communicate
Implemented incrementally, rather than all at once which lessens the cost and reduces the
organisational disruption from an abrupt switch in technologies
8
Web Service: Definition Revisited
9
Example: Web-based purchase
[Figure: a purchase order is routed to a Credit Service and a PO Service, and their results are consolidated into an invoice.]
11
Web Service Model
12
Web Service Model (contd…)
• Roles in Web Service architecture
• Service provider
• Owner of the service
• Platform that hosts access to the service
• Service requestor
• Business that requires certain functions to be satisfied
• Application looking for and invoking an interaction with a service
• Service registry
• Searchable registry of service descriptions where service providers publish their service
descriptions
13
Web Service Model (contd…)
• Operations in a Web Service Architecture
• Publish
• Service descriptions need to be published in order for service requestor to find
them
• Find
• Service requestor retrieves a service description directly or queries the service
registry for the service required
• Bind
• Service requestor invokes or initiates an interaction with the service at runtime
14
Web Service Components
• XML – eXtensible Markup Language
• A uniform data representation and exchange mechanism.
15
Steps of Operation
1. Client queries the registry (UDDI) to locate the service.
2. Registry refers the client to the service description (WSDL).
[Figure: the remaining steps of the client, registry, and service interaction.]
16
Web Service Stack
17
XML
• Developed from the Standard Generalized Markup Language (SGML)
• Widely supported by W3C
• Essential characteristic is the separation of content from presentation
• Designed to describe data
• XML document can optionally reference a Document Type Definition (DTD)
or an XML Schema
• XML parser checks syntax
• If an XML document adheres to the structure of the schema it is valid
18
XML (contd…)
19
XML vs HTML
An HTML example: <html>
<body>
<h2>John Doe</h2>
<p>2 Backroads Lane<br>
New York<br>
045935435<br>
[email protected]<br>
</p>
</body>
</html>
20
XML vs HTML (contd…)
• This will be displayed as:
John Doe
2 Backroads Lane
New York
045935435
[email protected]
• HTML specifies how the document is to be displayed, and not what information is contained in
the document.
• Hard for machine to extract the embedded information. Relatively easy for human.
21
XML vs HTML (contd…)
• Now look at the following:
<?xml version="1.0"?>
<contact>
<name>John Doe</name>
<address>2 Backroads Lane</address>
<country>New York</country>
<phone>045935435</phone>
<email>[email protected]</email>
</contact>
• In this case:
• The information contained is being marked, but not for displaying.
• Readable by both human and machines.
22
SOAP
• Simple Object Access Protocol
• Format for sending messages over Internet between programs
• XML-based
• Platform and language independent
• Simple and extensible
• Uses mainly HTTP as a transport protocol
• HTTP message contains a SOAP message as its payload section
• Stateless, one-way
• But applications can create more complex interaction patterns
23
SOAP Building Blocks
• Envelope (required) – identifies the XML document as a SOAP message
• Header (optional) – contains header information
• Body (required) – contains call and response information, and may carry a Fault element for errors
[Figure: the SOAP Envelope sits inside the transport protocol/MIME header and contains the SOAP Header and the SOAP Body; the Body may contain a Fault.]
24
SOAP Message Structure
• Request and Response messages carry an application-specific message vocabulary
25
SOAP Request
POST /InStock HTTP/1.1
Host: www.stock.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 150
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
<soap:Body xmlns:m="http://www.stock.org/stock">
<m:GetStockPrice>
<m:StockName>IBM</m:StockName>
</m:GetStockPrice>
</soap:Body>
</soap:Envelope>
26
SOAP Response
HTTP/1.1 200 OK
Content-Type: application/soap; charset=utf-8
Content-Length: 126
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
<soap:Body xmlns:m="http://www.stock.org/stock">
<m:GetStockPriceResponse>
<m:Price>34.5</m:Price>
</m:GetStockPriceResponse>
</soap:Body>
</soap:Envelope>
27
Why SOAP?
• Other distributed technologies failed on the Internet
• Unix RPC – requires binary-compatible Unix implementations at each endpoint
• CORBA – requires compatible ORBs
• RMI – requires Java at each endpoint
• DCOM – requires Windows at each endpoint
28
SOAP Characteristics
29
SOAP Usage Models
• RPC-like message exchange
• Request message bundles up method name and parameters
• Response message contains method return values
• However, it isn’t required by SOAP
30
SOAP Security
31
WSDL - Web Services Description Language
• WSDL : XML vocabulary standard for describing Web services and their
capabilities
• Contract between the XML Web service and the client
• Specifies what a request message must contain and what the response
message will look like in unambiguous notation
• Defines where the service is available and what communications
protocol is used to talk to the service.
32
WSDL Document Structure
• A WSDL document is just a simple XML document.
• It defines a web service using these major elements:
• port type - The operations performed by the web service.
• message - The messages used by the web service.
• types - The data types used by the web service.
• binding - The communication protocols used by the web service.
33
A Sample WSDL
<message name="getTermRequest">
<part name="term" type="xs:string"/>
</message>
<message name="getTermResponse">
<part name="value" type="xs:string"/>
</message>
<portType name="glossaryTerms">
<operation name="getTerm">
<input message="getTermRequest"/>
<output message="getTermResponse"/>
</operation>
</portType>
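As a hedged sketch of how a client might consume a WSDL like this with JAX-WS (assuming the javax.xml.ws API is available; the service name, namespace, URL, and annotated interface below are hypothetical and would have to match the real WSDL):

import javax.jws.WebService;
import javax.xml.namespace.QName;
import javax.xml.ws.Service;
import java.net.URL;

// Hypothetical service endpoint interface mirroring the glossaryTerms portType above.
@WebService
interface GlossaryTerms {
    String getTerm(String term);
}

public class WsdlClientSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical WSDL location and qualified service name.
        URL wsdl = new URL("http://www.example.org/glossary?wsdl");
        QName serviceName = new QName("http://www.example.org/glossary", "GlossaryService");
        Service service = Service.create(wsdl, serviceName);
        GlossaryTerms port = service.getPort(GlossaryTerms.class); // dynamic proxy that speaks SOAP
        System.out.println(port.getTerm("cloud"));
    }
}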
34
Binding to SOAP
<message name="getTermRequest">
<part name="term" type="xs:string"/>
</message>
<message name="getTermResponse">
<part name="value" type="xs:string"/>
</message>
<portType name="glossaryTerms">
<operation name="getTerm">
<input message="getTermRequest"/>
<output message="getTermResponse"/>
</operation>
</portType>
35
UDDI - Universal Description, Discovery, and Integration
• A framework to define XML-based registries
• Registries are repositories that contain documents that describe business data and
also provide search capabilities and programmatic access to remote applications
• Businesses can publish information about themselves and the services they offer
• Can be interrogated by SOAP messages and provides access to WSDL documents
describing web services in its directory
36
UDDI Roles and Operations
• Service Registry
  • Provides support for publishing and locating services
  • Like telephone yellow pages
• Service Provider
  • Provides e-business services
  • Publishes these services through a registry
• Service Requestor
  • Finds required services via the Service Broker
  • Binds to services via the Service Provider
37
How can UDDI be Used?
38
UDDI Benefits
39
Web Services Security Architecture
[Figure: the WS-Security stack - SOAP at the base, WS-Security on top of it, and WS-SecureConversation, WS-Federation, and WS-Authorization layered above.]
40
Thank You!
41
CLOUD COMPUTING
SERVICE LEVEL AGREEMENT (SLA)
2
SLA Contents
• A set of services which the provider will deliver
• A complete, specific definition of each service
• The responsibilities of the provider and the consumer
• A set of metrics to measure whether the provider is offering the services
as guaranteed
• An auditing mechanism to monitor the services
• The remedies available to the consumer and the provider if the terms are
not satisfied
• How the SLA will change over time
3
Web Service SLA
• WS-Agreement
– XML-based language and protocol for negotiating, establishing, and managing
service agreements at runtime
– Specifies the nature of the agreement template
– Facilitates discovering compatible providers
– Interaction : request-response
– SLA violation : dynamically managed and verified
• WSLA (Web Service Level Agreement Framework)
– Formal XML-schema based language to express SLA and a runtime interpreter
– Measure and monitor QoS parameters and report violations
– Lack of formal definitions for semantics of metrics
4
Difference between Cloud SLA and Web
Service SLA
• QoS Parameters :
– Traditional Web Service : response time, SLA violation rate for reliability, availability, cost of
service, etc.
– Cloud computing : QoS related to security, privacy, trust, management, etc.
• Automation :
– Traditional Web Service : SLA negotiation, provisioning, service delivery, monitoring are not
automated.
– Cloud computing : SLA automation is required for highly dynamic and scalable service
consumption
• Resource Allocation :
– Traditional Web Service : UDDI (Universal Description Discovery and Integration) for
advertising and discovering between web services
– Cloud computing : resources are allocated and distributed globally without any central
directory
5
Types of SLA
6
Service Level Objectives (SLOs)
• Objectively measurable conditions for the service
• Encompasses multiple QoS parameters viz. availability,
serviceability, billing, penalties, throughput, response time, or
quality
• Example :
– “Availability of a service X is 99.9%”
– “Response time of a database query Q is between 3 to 5 seconds”
– “Throughput of a server S at peak load time is 0.875”
7
Service Level Management
8
Considerations for SLA
• Business Level Objectives: Consumers should know why they are using cloud
services before they decide how to use cloud computing.
• Responsibilities of the Provider and Consumer: The balance of
responsibilities between providers and consumers will vary according to the
type of service.
• Business Continuity and Disaster Recovery: Consumers should ensure their
cloud providers have adequate protection in case of a disaster.
• System Redundancy: Many cloud providers deliver their services via massively
redundant systems. Those systems are designed so that even if hard drives or
network connections or servers fail, consumers will not experience any
outages.
9
Considerations for SLA (contd…)
• Maintenance: Maintenance of cloud infrastructure affects any kind of cloud offerings
(applicable to both software and hardware)
• Location of Data: If a cloud service provider promises to enforce data location regulations,
the consumer must be able to audit the provider to prove that regulations are being
followed.
• Seizure of Data: If law enforcement targets the data and applications associated with a
particular consumer, the multi-tenant nature of cloud computing makes it likely that other
consumers will be affected. Therefore, the consumer should consider using a third-party to
keep backups of their data
• Failure of the Provider: Consumers should consider the financial health of their provider and
make contingency plans. The provider’s policies of handling data and applications of a
consumer whose account is delinquent or under dispute are to be considered.
• Jurisdiction: Consumers should understand the laws that apply to any cloud providers they
consider.
10
SLA Requirements
• Security: Cloud consumer must understand the controls and federation patterns
necessary to meet the security requirements. Providers must understand what
they should deliver to enable the appropriate controls and federation patterns.
• Data Encryption: Details of encryption and access control policies.
• Privacy: Isolation of customer data in a multi-tenant environment.
• Data Retention and Deletion: Some cloud providers have legal requirements to
retain data even if it has been deleted by the consumer. Hence, they must be
able to prove their compliance with these policies.
• Hardware Erasure and Destruction: The provider is required to zero out the memory when a
consumer powers off a VM, and even to zero out the platters of a disk if it is to be
disposed of or recycled.
11
SLA Requirements (Contd…)
• Regulatory Compliance: If regulations are enforced on data and applications, the providers should
be able to prove compliance.
• Transparency: For critical data and applications, providers must be proactive in notifying consumers
when the terms of the SLA are breached.
• Certification: The provider should be responsible for proving the certification of any kind of data or
applications and for keeping it up to date.
• Monitoring: To eliminate the conflict of interest between the provider and the consumer, a neutral
third-party organization is the best solution to monitor performance.
• Auditability: As the consumers are liable to any breaches that occur, it is vital that they should be
able to audit provider’s systems and procedures. An SLA should make it clear how and when those
audits take place. Because audits are disruptive and expensive, the provider will most likely place
limits and charges on them.
12
Key Performance Indicators (KPIs)
• Low-level resource metrics
• Multiple KPIs are composed, aggregated, or
converted into high-level SLOs.
• Example :
– downtime, uptime, inbytes, outbytes, packet size, etc.
• Possible mapping :
– Availability (A) = 1 – (downtime/uptime)
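An illustrative sketch of such a KPI-to-SLO mapping (the class, sample numbers, and the 99.9% target are examples, not from the slide; availability here is computed as uptime / (uptime + downtime) over the observation period):

// Aggregate low-level KPI samples (downtime minutes) into a high-level availability SLO.
public class AvailabilitySlo {
    public static double availability(double uptimeMinutes, double downtimeMinutes) {
        return uptimeMinutes / (uptimeMinutes + downtimeMinutes);
    }

    public static void main(String[] args) {
        double downtime = 40;                     // KPI: observed downtime in a 30-day month (minutes)
        double uptime = 30 * 24 * 60 - downtime;  // remaining minutes of the observation period
        double a = availability(uptime, downtime);
        double target = 0.999;                    // example SLO: "availability of service X is 99.9%"
        System.out.printf("availability = %.4f, SLO met: %b%n", a, a >= target);
    }
}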
13
Industry-defined KPIs
• Monitoring:
– Natural questions:
• “who should monitor the performance of the provider?”
• “does the consumer meet its responsibilities?”
– Solution: neutral third-party organization to perform monitoring
– Eliminates conflicts of interest if:
• Provider reports outage at its sole discretion
• Consumer is responsible for an outage
• Auditability:
– Consumer requirement:
• Is the provider adhering to legal regulations and industry standards?
• SLA should make it clear how and when to conduct audits
14
Metrics for Monitoring and Auditing
• Throughput – How quickly the service responds
• Availability – Represented as a percentage of uptime for a service in a given
observation period.
• Reliability – How often the service is available
• Load balancing – When elasticity kicks in (new VMs are booted or terminated, for
example)
• Durability – How likely the data is to be lost
• Elasticity – The ability for a given resource to grow infinitely, with limits (the
maximum amount of storage or bandwidth, for example) clearly stated
• Linearity – How a system performs as the load increases
15
Metrics for Monitoring and Auditing (Contd…)
• Agility – How quickly the provider responds as the consumer's resource load scales up and
down
• Automation – What percentage of requests to the provider are handled without any human
interaction
• Customer service response times – How quickly the provider responds to a service request.
This refers to the human interactions required when something goes wrong with the on-
demand, self-service aspects of the cloud.
• Service-level violation rate – Expressed as the mean rate of SLA violation due to
infringements of the agreed warranty levels.
• Transaction time – Time that has elapsed from when a service is invoked till the completion
of the transaction, including the delays.
• Resolution time – Time period between detection of a service problem and its resolution.
16
SLA Requirements w.r.t. Cloud Delivery Models
17
Example Cloud SLAs
• Amazon EC2 (IaaS): Availability (99.95%), with definitions of Service Year (365 days of the year),
Annual Percentage Uptime, Region Unavailability (no external connectivity during a five-minute
period), Eligible Credit Period, and Service Credit
• Amazon S3 (Storage-as-a-Service): Availability (99.9%), with definitions of Error Rate, Monthly
Uptime Percentage, and Service Credit
• Amazon SimpleDB (Database-as-a-Service): No specific SLA is defined and the agreement does not
guarantee availability
• Salesforce CRM (PaaS): No SLA guarantees for the service provided
• Google App Engine (PaaS): Availability (99.9%), with definitions of Error Rate, Error Request,
Monthly Uptime Percentage, Scheduled Maintenance, Service Credits, and SLA exclusions
18
Example Cloud SLAs (contd…)
• Microsoft Azure Compute (IaaS/PaaS): Availability (99.95%), with definitions of Monthly
Connectivity Uptime Service Level, Monthly Role Instance Uptime Service Level, Service Credits,
and SLA exclusions
• Microsoft Azure Storage (Storage-as-a-Service): Availability (99.9%), with definitions of Error Rate,
Monthly Uptime Percentage, Total Storage Transactions, Failed Storage Transactions, Service
Credit, and SLA exclusions
• Zoho suite – Zoho Mail, Zoho CRM, Zoho Books (SaaS): Allows the user to customize the service
level agreement guarantees based on Resolution Time, Business Hours & Support Plans, and
Escalation
19
Example Cloud SLAs (contd…)
• Rackspace Cloud Server (IaaS):
– Availability: Internal Network (100%), Data Center Infrastructure (100%), Load Balancers (99.9%)
– Performance related to service degradation: server migration is notified 24 hours in advance and
completed in 3 hours (maximum)
– Recovery Time: in case of failure, restoration/recovery is guaranteed within 1 hour after the
problem is identified
• Terremark vCloud Express (IaaS): Monthly Uptime Percentage (100%), with definitions of Service
Credit, Credit Request and Payment Procedure, and SLA exclusions
20
Example Cloud SLAs (contd…)
• Nirvanix – Public, Private, Hybrid Cloud Storage (Storage-as-a-Service): Monthly Availability
Percentage (99.9%), with definitions of Service Availability, Service Credits, Data Replication
Policy, Credit Request Procedure, and SLA Exclusions
21
Limitations
• Service measurement
– Restricted to uptime percentage
– Measured by taking the mean of service availability observed over a specific
period of time
– Ignores other parameters like stability, capacity, etc.
• Bias towards vendors
– Measurement of parameters is mostly established to the vendor's advantage
• Lack of active monitoring on customer’s side
– Customers are given access to some ticketing systems and are responsible for
monitoring the outages.
– Providers do not provide any access to active data streams or audit trails, nor do
they report any outages.
22
Limitations (contd…)
• Gap between QoS hype and SLA offerings in reality
• QoS in the areas of governance, reliability, availability, security, and
scalability are not well addressed.
• No formal way of verifying whether the SLA guarantees are actually being met.
• Proper SLAs are good for both the provider and the customer
– Provider's perspective: improving the Cloud infrastructure, fair competition in the Cloud
marketplace
– Customer's perspective: trust relationship with the provider, choosing the appropriate provider
for moving the respective business to the Cloud
23
Expected SLA Parameters
• Infrastructure-as-a-Service (IaaS):
– CPU capacity, cache memory size, boot time of standard images,
storage, scale up (maximum number of VMs for each user), scale
down (minimum number of VMs for each user), On demand
availability, scale uptime, scale downtime, auto scaling, maximum
number of VMs configured on physical servers, availability, cost
related to geographic locations, and response time
• Platform-as-a-Service (PaaS):
– Integration, scalability, billing, environment of deployment (licenses,
patches, versions, upgrade capability, federation, etc.), servers,
browsers, number of developers
24
Expected SLA Parameters (contd…)
• Software-as-a-Service (SaaS):
– Reliability, usability, scalability, availability, customizability, Response
time
• Storage-as-a-Service :
– Geographic location, scalability, storage space, storage billing, security,
privacy, backup, fault tolerance/resilience, recovery, system
throughput, transferring bandwidth, data life cycle management
25
26
Cloud Computing : Economics
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
1
Cloud Properties: Economic Viewpoint
• Common Infrastructure
– pooled, standardized resources, with benefits generated by statistical
multiplexing.
• Location-independence
– ubiquitous availability meeting performance requirements, with benefits
deriving from latency reduction and user experience enhancement.
• Online connectivity
– an enabler of other attributes ensuring service access. Costs and
performance impacts of network architectures can be quantified using
traditional methods.
9/3/2017 2
Cloud Properties: Economic Viewpoint
Contd…
• Utility pricing
– usage-sensitive or pay-per-use pricing, with benefits applying in
environments with variable demand levels.
• on-Demand Resources
– scalable, elastic resources provisioned and de-provisioned without
delay or costs associated with change.
9/3/2017 3
Value of Common Infrastructure
• Economies of scale
– Reduced overhead costs
– Buyer power through volume purchasing
• Statistics of Scale
– For infrastructure built to peak requirements:
• Multiplexing demand → higher utilization
• Lower cost per delivered resource than unconsolidated workloads
– For infrastructure built to less than peak:
• Multiplexing demand → reduced unserved demand
• Lower loss of revenue or a Service-Level Agreement violation payout
9/3/2017 4
A Useful Measure of “Smoothness”
• The coefficient of variation CV
– Not the variance σ² nor the correlation coefficient
• Ratio of the standard deviation σ to the absolute value of the mean |μ|
• “Smoother” curves:
– large mean for a given standard deviation
– or smaller standard deviation for a given mean
• Importance of smoothness:
– a facility with fixed assets servicing highly variable demand will achieve lower
utilization than a similar one servicing relatively smooth demand.
• Multiplexing demand from multiple sources may reduce the coefficient
of variation CV
9/3/2017 5
Coefficient of variation CV
• X1, X2, …, Xn independent random variables for demand
– Identical standard deviation σ and mean µ
• Aggregated demand:
– Mean = sum of means = n·µ
– Variance = sum of variances = n·σ², so standard deviation = √n·σ
– Coefficient of variation of the aggregate = (√n·σ)/(n·µ) = σ/(√n·µ) = Cv/√n
• Adding n independent demands reduces the Cv by a factor of 1/√n
– Penalty of insufficient/excess resources grows smaller
– Aggregating 100 workloads brings the penalty down to 10%
9/3/2017 6
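A small simulation (illustrative only; assumes numpy and normally distributed demand streams) showing the 1/√n smoothing effect described above:

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, n, samples = 100.0, 30.0, 100, 10_000

    single = rng.normal(mu, sigma, samples)                  # one demand stream
    aggregate = rng.normal(mu, sigma, (samples, n)).sum(1)   # n independent streams summed

    cv = lambda x: x.std() / abs(x.mean())
    print(cv(single))      # ~0.30
    print(cv(aggregate))   # ~0.03, i.e. reduced by about 1/sqrt(100)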
But What about Workloads?
• Negatively correlated demands
– X and (1 – X): the sum is the constant random variable 1 (zero variance)
– Achieved by appropriate selection of customer segments
• Perfectly correlated demands
– Aggregated demand: n·X, variance of sum: n²·σ²(X)
– Mean: n·µ, standard deviation: n·σ(X)
– Coefficient of variation remains constant (no smoothing from aggregation)
• Simultaneous peaks
9/3/2017 7
Common Infrastructure in Real World
• Correlated demands:
– Private, mid-size and large-size providers can experience similar
statistics of scale
• Independent demands:
– Midsize providers can achieve similar statistical economies to an infinitely
large provider
• Available data on economy of scale for large providers is
mixed
– All providers use the same COTS computers and components
– Locating near cheap power supplies
– Early-entrant automation tools → 3rd parties now take care of it
9/3/2017 8
Value of Location Independence
• We used to go to the computers, but applications, services and contents now come
to us!
– Through networks: Wired, wireless, satellite, etc.
• But what about latency?
– Human response latency: 10s to 100s milliseconds
– Latency is correlated with:
• Distance (Strongly)
• Routing algorithms of routers and switches (second order effects)
– Speed of light in fiber: only 124 miles per millisecond
– If the Google word suggestion took 2 seconds, it would be unusable
– VoIP with a latency of 200 ms or more degrades noticeably
9/3/2017 9
Value of Location Independence
Contd…
9/3/2017 10
Value of Utility Pricing
• As mentioned before, economy of scale might not be very effective
• But cloud services don’t need to be cheaper to be economical!
• Consider a car
– Owning (buy or lease) costs INR 10,000/- per day, paid for every day you own it
– Renting costs INR 45,000/- per day, paid only for the days you use it
– If you need a car for only 2 days of a trip, owning would be much more costly than renting
• It depends on the demand
9/3/2017 11
Utility Pricing in Detail
• D(t): demand for resources, 0 < t < T
• P = max D(t): peak demand
• A = avg D(t): average demand
• B: baseline (owned) unit cost [BT: total baseline cost]
• C: cloud unit cost [CT: total cloud cost]
• U = C/B: utility premium [for the rental car example, U = 4.5]
• Total cloud cost: CT = ∫₀ᵀ U·B·D(t) dt = A·U·B·T
• Total baseline cost: BT = P·B·T (the baseline must be provisioned to handle peak demand)
• When is cloud cheaper than owning?
– CT < BT ⟺ A·U·B·T < P·B·T ⟺ U < P/A
– i.e., when the utility premium is less than the ratio of peak demand to average demand
9/3/2017 12
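A minimal sketch of the break-even rule derived above (CT < BT exactly when U < P/A); the peak and average values are invented for illustration:

    def cloud_is_cheaper(peak, average, utility_premium):
        # cloud wins when the utility premium is below the peak-to-average demand ratio
        return utility_premium < peak / average

    # With the rental-car premium U = 4.5:
    print(cloud_is_cheaper(peak=100, average=10, utility_premium=4.5))   # True  (spiky demand)
    print(cloud_is_cheaper(peak=100, average=50, utility_premium=4.5))   # False (steady demand)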
Utility Pricing in Real World
• In practice demands are often highly spiky
– News stories, marketing promotions, product launches, Internet flash floods
(Slashdot effect), tax season, Christmas shopping, processing a drone
footage for a 1 week border skirmish, etc.
• Often a hybrid model is the best
– You own a car for daily commute, and rent a car when traveling or when you
need a van to move
– Key factor is again the ratio of peak to average demand
– But we should also consider other costs
• Network cost (both fixed costs and usage costs)
• Interoperability overhead
• Consider Reliability, accessibility
9/3/2017 13
Value of on-Demand Services
• Simple Problem: When owning your resources, you will pay a penalty
whenever your resources do not match the instantaneous demand
I. Either pay for unused resources, or suffer the penalty of missing service delivery
D(t) – instantaneous demand at time t
R(t) – resources provisioned at time t
Penalty cost ∝ ∫ |D(t) – R(t)| dt
If demand is flat, penalty = 0
If demand grows linearly, periodic provisioning is acceptable
9/3/2017 14
Penalty Costs for Exponential Demand
• Penalty cost ∝ ∫ |D(t) − R(t)| dt
• If demand is exponential (D(t) = e^t), any fixed provisioning interval t_p, provisioned according
to the then-current demand, falls exponentially behind
• R(t) = e^(t − t_p)
• D(t) − R(t) = e^t − e^(t − t_p) = e^t·(1 − e^(−t_p)) = k₁·e^t
• Penalty cost ∝ c·k₁·e^t
9/3/2017 15
Coefficient of Variation - Cv
• A statistical measure of the dispersion of data points in a data series around the
mean.
• The coefficient of variation represents the ratio of the standard deviation to the
mean, and it is a useful statistic for comparing the degree of variation from one
data series to another, even if the means are drastically different from each
other
• In the investing world, the coefficient of variation allows you to determine how
much volatility (risk) you are assuming in comparison to the amount of return
you can expect from your investment. In simple language, the lower the ratio
of standard deviation to mean return, the better your risk-return tradeoff.
9/3/2017 16
Assignment 1
Consider that the peak computing demand for an organization is 120 units. The
demand as a function of time can be expressed as:
D(t) = 50·sin(t), for 0 ≤ t < π/2
D(t) = 20·sin(t), for π/2 ≤ t < π
17
Assignment 1 (contd…)
The cost to provision unit cloud resource for unit time is 0.9 units.
Calculate the penalty and draw inference.
[Assume the delay in provisioning is π/12 time units and the minimum demand is 0]
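One possible way to estimate the penalty numerically. This sketch assumes the provisioned resource simply lags demand by the provisioning delay, i.e. R(t) = D(t − π/12), with R(t) = 0 before the first provisioning; these modelling assumptions are mine, not stated in the assignment.

    import math

    def D(t):
        # piecewise demand from the assignment; 0 outside [0, pi)
        if 0 <= t < math.pi / 2:
            return 50 * math.sin(t)
        if math.pi / 2 <= t < math.pi:
            return 20 * math.sin(t)
        return 0.0

    delay, unit_cost, steps = math.pi / 12, 0.9, 100_000
    dt = math.pi / steps

    # Penalty ~ unit_cost * integral of |D(t) - R(t)| dt, with R(t) = D(t - delay)
    penalty = unit_cost * sum(abs(D(i * dt) - D(i * dt - delay)) * dt for i in range(steps))
    print(round(penalty, 2))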
1
Introduction
• Relational database
– Default data storage and retrieval mechanism since 80s
– Efficient in: transaction processing
– Example: System R, Ingres, etc.
– Replaced hierarchical and network databases
• For scalable web search service:
– Google File System (GFS)
• Massively parallel and fault tolerant distributed file system
– BigTable
• Organizes data
• Similar to column-oriented databases (e.g. Vertica)
– MapReduce
• Parallel programming paradigm
9/3/2017 2
Introduction Contd…
• Suitable for:
– Large volume massively parallel text processing
– Enterprise analytics
• Similar to BigTable data model are:
– Google App Engine’s Datastore
– Amazon’s SimpleDB
9/3/2017 3
Relational Databases
• Users/application programs interact with an RDBMS through SQL
• RDBMS parser:
– Transforms queries into memory and disk-level operations
– Optimizes execution time
• Disk-space management layer:
– Stores data records on pages of contiguous memory blocks
– Pages are fetched from disk into memory as requested using pre-fetching and
page replacement policies
9/3/2017 4
Relational Databases Contd…
• Database file system layer:
– Independent of OS file system
– Reason:
• To have full control on retaining or releasing a page in memory
• Files used by the DB may span multiple disks to handle large storage
– Uses parallel I/O systems, viz. RAID disk arrays or multi-
processor clusters
9/3/2017 5
Data Storage Techniques
• Row-oriented storage
– Optimal for write-oriented operations viz. transaction processing applications
– Relational records: stored on contiguous disk pages
– Accessed through indexes (primary index) on specified columns
– Example: B+-tree-like storage
• Column-oriented storage
– Efficient for data-warehouse workloads
• Aggregation of measure columns need to be performed based on values from dimension
columns
• Projection of a table is stored as sorted by dimension values
• Require multiple “join indexes”
– If different projections are to be indexed in sorted order
9/3/2017 6
Data Storage Techniques Contd…
9/3/2017 7
Parallel Database Architectures
• Shared memory
– Suitable for servers with multiple CPUs
– Memory address space is shared and managed by a symmetric multi-processing (SMP) operating system
– SMP:
• Schedules processes in parallel exploiting all the processors
• Shared nothing
– Cluster of independent servers each with its own disk space
– Connected by a network
• Shared disk
– Hybrid architecture
– Independent server clusters share storage through high-speed network storage viz. NAS (network attached
storage) or SAN (storage area network)
– Clusters are connected to storage via: standard Ethernet, or faster Fiber Channel or Infiniband connections
9/3/2017 8
Parallel Database Architectures contd…
9/3/2017 9
Advantages of Parallel DB over Relational DB
• Efficient execution of SQL queries by exploiting multiple processors
• For shared nothing architecture:
– Tables are partitioned and distributed across multiple processing nodes
– SQL optimizer handles distributed joins
• Distributed two-phase commit locking for transaction isolation between processors
• Fault tolerant
– System failures handled by transferring control to “stand-by” system [for transaction
processing]
– Restoring computations [for data warehousing applications]
9/3/2017 10
Advantages of Parallel DB over Relational DB
9/3/2017 11
Cloud File Systems
• Google File System (GFS)
– Designed to manage relatively large files using a very large distributed cluster of
commodity servers connected by a high-speed network
– Handles:
• Failures even during reading or writing of individual files
• Fault tolerant: a necessity
– P(system failure) = 1 – (1 – p(component failure))^N → 1 (for large N)
• Support parallel reads, writes and appends by multiple simultaneous client programs
• Hadoop Distributed File System (HDFS)
– Open source implementation of GFS architecture
– Available on Amazon EC2 cloud platform
9/3/2017 12
GFS Architecture
9/3/2017 13
GFS Architecture Contd…
9/3/2017 14
Read Operation in GFS
• Client program sends the full path and offset of a file to the Master (GFS)
or Name Node (HDFS)
• Master replies with meta-data for one of replicas of the chunk where this
data is found.
• Client caches the meta-data for faster access
• It reads data from the designated chunk server
9/3/2017 15
Write/Append Operation in GFS
– Client program sends the full path of a file to the Master (GFS) or Name Node (HDFS)
– Master replies with meta-data for all of replicas of the chunk where this data is found.
– Client sends the data to be appended to all chunk servers
– Chunk servers acknowledge the receipt of this data
– Master designates one of these chunk servers as primary
– Primary chunk server appends its copy of data into the chunk by choosing an offset
• Appending can also be done beyond EOF to account for multiple simultaneous
writers
– Sends the offset to each replica
– If all replicas do not succeed in writing at the designated offset, the primary retries
9/3/2017 16
Fault Tolerance in GFS
• Master maintains regular communication with chunk servers
– Heartbeat messages
• In case of failures:
– Chunk server’s meta-data is updated to reflect failure
– For failure of primary chunk server, the master assigns a new primary
– Clients may occasionally still try to contact this failed chunk server
• Update their meta-data from master and retry
9/3/2017 17
BigTable
• Distributed structured storage system built on GFS
• Sparse, persistent, multi-dimensional sorted map (key-value pairs)
• Data is accessed by:
– Row key
– Column key
– Timestamp
9/3/2017 18
BigTable Contd…
• Each column can store arbitrary name-value pairs in the form: column-
family : label
• Set of possible column-families for a table is fixed when it is created
• Labels within a column family can be created dynamically and at any time
• Each BigTable cell (row, column) can store multiple versions of the data in
decreasing order of timestamp
– As data in each column is stored together, they can be accessed efficiently
9/3/2017 19
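A toy illustration of the data model described above, using plain Python dictionaries rather than the real BigTable API; the row and column names follow the well-known webtable example and are purely illustrative:

    # cell values keyed by (row key, "column-family:label", timestamp)
    table = {
        ("com.cnn.www", "contents:html", 3): "<html>v3</html>",
        ("com.cnn.www", "contents:html", 2): "<html>v2</html>",
        ("com.cnn.www", "anchor:cnnsi.com", 9): "CNN",
    }

    def read(table, row, column):
        # return all versions of a cell in decreasing timestamp order
        versions = [(ts, v) for (r, c, ts), v in table.items() if r == row and c == column]
        return sorted(versions, reverse=True)

    print(read(table, "com.cnn.www", "contents:html"))   # [(3, '<html>v3</html>'), (2, '<html>v2</html>')]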
BigTable Storage
9/3/2017 20
BigTable Storage Contd…
• Each table is split into different row ranges, called tablets
• Each tablet is managed by a tablet server:
– Stores each column family for a given row range in a separate distributed file, called SSTable
• A single meta-data table is managed by a Meta-data server
– Locates the tablets of any user table in response to a read/write request
• The meta-data itself can be very large:
– Meta-data table can be similarly split into multiple tablets
– A root tablet points to other meta-data tablets
• Supports large parallel reads and inserts even simultaneously on the same table
• Insertions are done in sorted fashion and require more work than a simple append
9/3/2017 21
Dynamo
• Developed by Amazon
• Supports large volume of concurrent updates, each of which could be small
in size
– Different from BigTable: supports bulk reads and writes
• Data model for Dynamo:
– Simple <key, value> pair
– Well-suited for Web-based e-commerce applications
– Not dependent on any underlying distributed file system (for e.g. GFS/HDFS) for:
• Failure handling
– Data replication
– Forwarding write requests to other replicas if the intended one is down
• Conflict resolution
9/3/2017 22
Dynamo Architecture
9/3/2017 23
Dynamo Architecture Contd…
• Objects: <Key, Value> pairs with arbitrary arrays of bytes
• MD5: generates a 128-bit hash value
• Range of this hash function is mapped to a set of virtual nodes arranged in a ring
– Each key gets mapped to one virtual node
• The object is replicated at a primary virtual node as well as (N – 1) additional
virtual nodes
– N: number of physical nodes
• Each physical node (server) manages a number of virtual nodes at distributed
positions on the ring
9/3/2017 24
Dynamo Architecture Contd…
• Load balancing for:
– Transient failures
– Network partition
• Write request on an object:
– Executed at one of its virtual nodes
– Forwards the request to all nodes which have the replicas of the object
– Quorum protocol: maintains eventual consistency of the replicas when a large
number of concurrent reads & writes take place
9/3/2017 25
Dynamo Architecture Contd…
• Distributed object versioning
– Write creates a new version of an object with its local timestamp incremented
– Timestamp:
• Captures history of updates
• Versions that are superseded by later versions (having larger vector timestamp) are
discarded
• If multiple write operations on the same object occur at the same time, all versions will be
maintained and returned to read requests
• If conflict occurs:
– Resolution done by application-independent logic
9/3/2017 26
Dynamo Architecture Contd…
• Quorum consistent:
– Read operation accesses R replicas
– Write operation access W replicas
• If (R + W) > N : system is said to be quorum consistent
– Overheads:
• For efficient write: larger number of replicas to be read
• For efficient read: larger number of replicas to be written into
• Dynamo:
– Implemented by different storage engines at the node level: Berkeley DB (used by Amazon),
MySQL, etc.
9/3/2017 27
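The quorum condition above can be stated in one line; the helper below is only an illustration of the R + W > N check:

    def quorum_consistent(n_replicas, r, w):
        # read and write sets must overlap so a read sees at least one latest copy
        return r + w > n_replicas

    print(quorum_consistent(3, 2, 2))   # True
    print(quorum_consistent(3, 1, 1))   # False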
Datastore
• Google and Amazon offer simple transactional <Key, Value> pair database stores
– Google App Engine’s Datastore
– Amazon’ SimpleDB
• All entities (objects) in Datastore reside in one BigTable table
– Does not exploit column-oriented storage
• Entities table: store data as one column family
9/3/2017 28
Datastore contd…
• Multiple index tables are used to support efficient queries
• BigTable:
– Horizontally partitioned (also called sharded) across disks
– Sorted lexicographically by the key values
• Beside lexicographic sorting Datastore enables:
– Efficient execution of prefix and range queries on key values
• Entities are ‘grouped’ for transaction purpose
– Keys are lexicographic by group ancestry
• Entities in the same group: stored close together on disk
• Index tables: support a variety of queries
– Uses values of entity attributes as keys
9/3/2017 29
Datastore Contd…
• Automatically created indexes:
– Single-Property indexes
• Supports efficient lookup of the records with WHERE clause
– ‘Kind’ indexes
• Supports efficient lookup of queries of form SELECT ALL
• Configurable indexes
– Composite index:
• Retrieves more complex queries
• Query execution
– The index with the highest selectivity is chosen
9/3/2017 30
31
Cloud Computing :
Introduction to MapReduce
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
1
Introduction
• MapReduce: programming model developed at Google
• Objective:
– Implement large scale search
– Text processing on massively scalable web data stored using BigTable and GFS distributed file
system
• Designed for processing and generating large volumes of data via massively parallel
computations, utilizing tens of thousands of processors at a time
• Fault tolerant: ensure progress of computation even if processors and networks fail
• Example:
– Hadoop: open source implementation of MapReduce (developed at Yahoo!)
– Available on pre-packaged AMIs on Amazon EC2 cloud platform
9/3/2017 2
Parallel Computing
• Different models of parallel computing
– Nature and evolution of multiprocessor computer architecture
– Shared-memory model
• Assumes that any processor can access any memory location
• Unequal latency
– Distributed-memory model
• Each processor can access only its own memory and communicates with other processors using message passing
• Parallel computing:
– Developed for compute intensive scientific tasks
– Later found application in the database arena
• Shared-memory
• Shared-disk
• Shared-nothing
9/3/2017 3
Parallel Database Architectures
9/3/2017 4
Parallel Database Architectures Contd…
• Shared memory
– Suitable for servers with multiple CPUs
– Memory address space is shared and managed by a symmetric multi-processing (SMP) operating system
– SMP:
• Schedules processes in parallel exploiting all the processors
• Shared nothing
– Cluster of independent servers each with its own disk space
– Connected by a network
• Shared disk
– Hybrid architecture
– Independent server clusters share storage through high-speed network storage viz. NAS (network
attached storage) or SAN (storage area network)
– Clusters are connected to storage via: standard Ethernet, or faster Fiber Channel or Infiniband
connections
9/3/2017 5
Parallel Efficiency
• If a task takes time T in uniprocessor system, it should take T/p if executed on p
processors
• Inefficiencies introduced in distributed computation due to:
– Need for synchronization among processors
– Overheads of message communication between processors
– Imbalance in the distribution of work to processors
• Parallel efficiency of an algorithm is defined as: ε = T / (p × T_p), where T_p is the time taken on p processors
9/3/2017 6
Illustration
• Problem: Consider a very large collection of documents, say web pages crawled
from the entire Internet. The problem is to determine the frequency (i.e., total
number of occurrences) of each word in this collection. Thus, if there are n
documents and m distinct words, we wish to determine m frequencies, one for
each word.
• Two approaches:
– Let each processor compute the frequencies for m/p words
– Let each processor compute the frequencies of m words across n/p documents, followed by all the
processors summing their results
• Parallel computing is implemented as a distributed-memory model with a shared
disk, so that each processor is able to access any document from disk in parallel
with no contention
9/3/2017 7
Illustration Contd…
• Time to read each word from the document = Time to send the word to
another processor via inter-process communication = c
• Time to add to a running total of frequencies -> negligible
• Each word occurs f times in a document (on average)
• Time for computing all m frequencies with a single processor = n × m × f × c
• First approach:
– Each processor reads at most n × m/p × f times
– Parallel efficiency is calculated as:
– Efficiency falls with increasing p
– Not scalable
9/3/2017 8
Illustration Contd…
• Second approach
– Number of reads performed by each processor = n/p × m × f
– Time taken to read = n/p × m × f × c
– Time taken to write partial frequencies of m-words in parallel to disk = c × m
– Time taken to communicate partial frequencies to (p - 1) processors and then
locally adding p sub-vectors to generate 1/p of final m-vector of frequencies =
p × (m/p) × c
– Parallel efficiency is computed as:
9/3/2017 9
Illustration Contd…
• Since p << nf, efficiency of second approach is higher than that of first
• In the first approach, each processor reads many words that it need not read, resulting in
wasted work
• In the second approach every read is useful in that it results in a
computation that contributes to the final answer
• Scalable
– Efficiency remains constant as both n and p increase proportionally
– Efficiency tends to 1 for fixed p as n gradually increases
9/3/2017 10
MapReduce Model
• Parallel programming abstraction
• Used by many different parallel applications which carry out large-scale
computation involving thousands of processors
• Leverages a common underlying fault-tolerant implementation
• Two phases of MapReduce:
– Map operation
– Reduce operation
• A configurable number of M ‘mapper’ processors and R ‘reducer’ processors are
assigned to work on the problem
• Computation is coordinated by a single master process
9/3/2017 11
MapReduce Model Contd…
• Map phase:
– Each mapper reads approximately 1/M of the input from the global file
system, using locations given by the master
– Map operation consists of transforming one set of key-value pairs to
another:
9/3/2017 12
MapReduce Model
Contd…
• Reduce phase:
– The master informs the reducers where the partial computations have been stored
on local files of respective mappers
– Reducers make remote procedure call requests to the mappers to fetch the files
– Each reducer groups the results of the map step using the same key and performs a
function f on the list of values that correspond to these key value:
9/3/2017 13
MapReduce: Example
• 3 mappers; 2 reducers
• Map function:
• Reduce function:
9/3/2017 14
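The map and reduce functions on this slide appear only as figures, so here is a minimal word-count sketch in the same spirit. It is plain Python for illustration (not the Hadoop or Google MapReduce API); the tiny driver at the end stands in for the master and the shuffle step:

    from collections import defaultdict

    def map_fn(doc_id, text):
        # (k1, v1) -> list of (k2, v2): emit (word, 1) for each word
        return [(word, 1) for word in text.split()]

    def reduce_fn(word, counts):
        # group by key, then apply f = sum over the list of values
        return (word, sum(counts))

    docs = {"d1": "the quick brown fox", "d2": "the lazy dog"}
    groups = defaultdict(list)
    for d, text in docs.items():
        for k, v in map_fn(d, text):
            groups[k].append(v)
    print([reduce_fn(k, vs) for k, vs in groups.items()])
    # [('the', 2), ('quick', 1), ('brown', 1), ('fox', 1), ('lazy', 1), ('dog', 1)]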
MapReduce: Fault Tolerance
• Heartbeat communication
– Updates are exchanged regarding the status of tasks assigned to workers
– Communication exists but no progress is made: the master duplicates those tasks and assigns
them to processors that have already completed their work
• If a mapper fails, the master reassigns the key-range designated to it to another
working node for re-execution
– Re-execution is required as the partial computations are written into local files,
rather than GFS file system
• If a reducer fails, only the remaining tasks are reassigned to another node, since
the completed tasks are already written back into GFS
9/3/2017 15
MapReduce: Efficiency
• General computation task on a volume of data D
• Takes wD time on a uniprocessor (time to read data from disk +
performing computation + time to write back to disk)
• Time to read/write one word from/to disk = c
• Now, the computational task is decomposed into map and reduce stages
as follows:
– Map stage:
• Mapping time = cmD
• Data produced as output = σD
– Reduce stage:
• Reducing time = crσD
• Data produced as output = σµD
9/3/2017 16
MapReduce: Efficiency Contd…
• Considering no overheads in decomposing a task into a map and a reduce stages, we have
the following relation:
w·D = c·D + c_m·D + c_r·σ·D + c·σ·µ·D
• Now, we use P processors that serve as both mapper and reducers in respective phases to
solve the problem
• Additional overhead:
– Each mapper writes to its local disk followed by each reducer remotely reading from the local disk of
each mapper
• For analysis purpose: time to read a word locally or remotely is same
• Time to read data from disk by each mapper = wD/P
• Data produced by each mapper = σD/P
9/3/2017 17
MapReduce: Efficiency Contd…
• Time required to write into local disk = cσD/P
• Data read by each reducer from its partition in each of the P mappers = σD/P²
• The entire exchange can be executed in P steps, with each reducer r reading from mapper
(r + i) mod P in step i
• Transfer time from mapper local disk to GFS for each reducer = (cσD/P²) × P = cσD/P
• Time taken by each processor in the parallel implementation, including the overhead of
intermediate disk reads and writes = (wD/P + 2cσD/P)
• Parallel efficiency of the MapReduce implementation:
ε_MR = wD / [P × (wD/P + 2cσD/P)] = 1 / (1 + 2cσ/w)
9/3/2017 18
MapReduce: Applications
• Indexing a large collection of documents
– Important aspect in web search as well as handling structured data
– The map task consists of emitting a word–document/record-id pair for each word:
(d_k, [w_1 … w_n]) → [(w_i, d_k)]
– The reduce step groups the pairs by word and creates an index entry for each word:
[(w_i, d_k)] → (w_i, [d_i1 … d_im])
9/3/2017 19
20
CLOUD COMPUTING
OPENSTACK:
2
Job Trend for OpenStack
3
OpenStack Capability
4
OpenStack Capability
▪ Virtual Machines (VMs) on demand
▪ Provisioning
▪ Snapshotting
▪ Network
▪ Storage for VMs and arbitrary files
▪ Multi-tenancy
▪ Quotas for different projects and users
▪ A user can be associated with multiple projects
5
OpenStack History
6
OpenStack Major Components
▪ Service - Compute
▪ Project - Nova
7
OpenStack Major Components
▪ Service - Networking
▪ Project - Neutron
8
OpenStack Major Components
▪ Service - Object storage
▪ Project - Swift
• Stores and retrieves arbitrary unstructured data objects via a RESTful, HTTP-based API.
• It is highly fault tolerant with its data replication and scale-out architecture. Its
implementation is not like a file server with mountable directories.
• In this case, it writes objects and files to multiple drives, ensuring the data is
replicated across a server cluster.
9
OpenStack Major Components
10
OpenStack Major Components
▪ Service - Identity
▪ Project - Keystone
11
OpenStack Major Components
12
OpenStack Major Components
▪ Service - Telemetry
▪ Project - Ceilometer
13
OpenStack Major Components
▪ Service - Dashboard
▪ Project - Horizon
14
Architecture of OpenStack
15
OpenStack Workflow
1. User logs in to the UI, specifies VM parameters (name, flavor, keys, etc.) and hits the "Create" button.
2. Horizon sends an HTTP request to Keystone; auth info is specified in the HTTP headers.
3. Keystone sends a temporary token back to Horizon via HTTP.
4. Horizon sends a POST request to the Nova API (signed with the given token).
16
Auth Token Usage
17
Provisioning Flow
▪ Nova API makes rpc.cast to Scheduler. It publishes a short message to scheduler queue with VM
info.
▪ Scheduler picks up the message from MQ.
▪ Scheduler fetches information about the whole cluster from database, filters, selects compute node
and updates DB with its ID
▪ Scheduler publishes message to the compute queue (based on host ID) to trigger VM provisioning
▪ Nova Compute gets message from MQ
▪ Nova Compute makes rpc.call to Nova Conductor for information on VM from DB
▪ Nova Compute makes a call to Neutron API to provision network for the instance
▪ Neutron configures IP, gateway, DNS name, L2 connectivity etc.
▪ It is assumed a volume is already created. Nova Compute contacts Cinder to get volume data. Can
also attach volumes after VM is built.
18
Nova Compute Driver
19
Nova scheduler filtering
20
Neutron Architecture
21
Glance Architecture
22
Cinder Architecture
23
Keystone Architecture
24
OpenStack Storage Concepts
• Ephemeral storage:
• Persists until VM is terminated
• Accessible from within VM as local file system
• Used to run operating system and/or scratch space
• Managed by Nova
• Block storage:
• Persists until specifically deleted by user
• Accessible from within VM as a block device (e.g. /dev/vdc)
• Used to add additional persistent storage to VM and/or run operating system
• Managed by Cinder
• Object storage:
• Persists until specifically deleted by user
• Accessible from anywhere
• Used to store files, including VM images
• Managed by Swift
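For reference, the three storage types map roughly onto the following python-openstackclient commands. This is an illustrative sketch, not part of the original slides; names such as myvm, myvolume and mycontainer are placeholders, and the exact flags vary between releases:

    # ephemeral storage comes with the instance itself (Nova)
    openstack server create --image ubuntu-16.04 --flavor m1.small myvm

    # block storage: create a Cinder volume and attach it to the VM
    openstack volume create --size 10 myvolume
    openstack server add volume myvm myvolume

    # object storage: upload a file to a Swift container
    openstack container create mycontainer
    openstack object create mycontainer backup.tar.gz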
25
Summary
▪ User logs into Horizon and initiates VM creation
▪ Keystone authorizes
▪ Nova initiates provisioning and saves state to DB
▪ Nova Scheduler finds appropriate host
▪ Neutron configures networking
▪ Cinder provides block device
▪ Image URI is looked up through Glance
▪ Image is retrieved via Swift
▪ VM is rendered by Hypervisor
26
Thank You!
27
CLOUD COMPUTING
Private Cloud Implementation using OpenStack
Overview
• VM Creation
• Accessing VM by User
• VM Termination
Meghamala - IITKgp Cloud
(using OpenStack)
Horizon Login Page
Overview of OpenStack Compute Nodes
Graphical representation of resource usage
Details of Instances
Cinder- details of Volumes
Glance- Overview of available images in Meghamala cloud
Neutron- Network Access Rules of a Security
Group
Nova-vCPU, RAM, Storage details of Hypervisors
Nova- Different flavors of VMs in Meghamala
Images of Cloud Instance in Meghamala
Compute Services in Meghamala
VM Creation
Accessing VM by User
Accessing the newly created VM through the X2Go client
Accessing newly created VM - ‘cloud-nptel’
VM Termination
Thank You!
38
CLOUD COMPUTING
CREATE A PYTHON WEB APP IN MICROSOFT AZURE:
• With Azure, developers get the freedom to build and deploy wherever they
want, using the tools, applications and frameworks of their choice.
Ref: https://azure.microsoft.com/en-in/
Deploy anywhere with your choice of tools
• Connecting cloud and on-premises with consistent hybrid cloud capabilities
and using open source technologies
Ref: https://azure.microsoft.com/en-in/
Protect your business with the most trusted cloud
• Azure helps to protect assets through a rigorous methodology and focus on
security, privacy, compliance and transparency.
Ref: https://azure.microsoft.com/en-in/
Accelerate app innovation
• Build simple to complex projects within a consistent portal experience using
deeply-integrated cloud services, so developers can rapidly develop, deploy
and manage their apps.
Ref: https://azure.microsoft.com/en-in/
Power decisions and apps with insights
• Uncover business insights with advanced analytics and data services for both
traditional and new data sources. Detect anomalies, predict behaviors and
recommend actions for your business.
Ref: https://azure.microsoft.com/en-in/
In this demo, we are going to present the creation of a
python web app in Microsoft Azure.
Ref: https://azure.microsoft.com/en-in/
Azure Web Apps
• Highly scalable, Self-patching web hosting service.
• Prerequisites
To complete this demo:
Install Git
Install Python
Ref: https://azure.microsoft.com/en-in/
Go to https://portal.azure.com/ and log in with your username and password
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Login with your username and password
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Launch Azure Cloud Shell : It is a free bash shell that we can directly use
within the Azure portal
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Download the sample
In a terminal window, run the following command to clone the sample app repository to your
local machine.
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
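The referenced quickstart clones Microsoft's Python sample; the repository name below is taken from that quickstart and may have changed since:

    git clone https://github.com/Azure-Samples/python-docs-hello-world
    cd python-docs-hello-world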
Change to the directory that contains the
sample code
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Install flask
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Run the app locally
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Open a web browser, and navigate to the sample app at http://localhost:5000.
You can see the Hello World message from the sample app displayed in the
page.
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Configure a deployment user
using the command
A deployment user is required for FTP and local Git
deployment to a web app.
az webapp deployment user set --user-name <username> --password <password>
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Create a resource group: A resource group is a logical container into which Azure resources
like web apps, databases, and storage accounts are deployed and managed.
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Create an Azure App Service plan
An App Service plan specifies the location, size, and features of the web
server farm that hosts your app. You can save money when hosting multiple
apps by configuring the web apps to share a single App Service plan.
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Create an Azure App Service plan
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Create a web app
The web app provides a hosting space for your code and provides a URL to
view the deployed app.
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Create a web app
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
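A hedged sketch of the CLI steps behind the last three slides (resource group, App Service plan and web app). The resource names mirror the demo (azappshu001, myResourceGroup); the flags follow the referenced quickstart and may differ in newer versions of the Azure CLI:

    az group create --name myResourceGroup --location "West Europe"
    az appservice plan create --name myAppServicePlan --resource-group myResourceGroup --sku FREE
    az webapp create --name azappshu001 --resource-group myResourceGroup \
        --plan myAppServicePlan --deployment-local-git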
Browse to the site azappshu001.azurewebsites.net to see your
newly created web app.
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Configure to use Python: Setting the Python version this way uses a
default container provided by the platform.
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Configure local Git deployment
App Service supports several ways to deploy content to a web app, such as
FTP, local Git, GitHub, Visual Studio Team Services, and Bitbucket. For this
quickstart, you deploy by using local Git. That means you deploy by using a Git
command to push from a local repository to a repository in Azure.
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Configure local Git deployment
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Push to Azure from Git: Add an Azure remote to your local Git
repository.
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Push to the Azure remote to deploy your app. You are prompted for the password you
created earlier when you created the deployment user. Make sure that you enter the
password you created in Configure a deployment user, not the password you use to
log in to the Azure portal.
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
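Illustrative Git commands for this step; the remote URL is a placeholder of the form printed by the earlier az webapp create --deployment-local-git call:

    git remote add azure https://<username>@azappshu001.scm.azurewebsites.net/azappshu001.git
    git push azure master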
Browse to the app at azappshu001.azurewebsites.net
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Update and redeploy the code
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Using a local text editor, open the main.py file in the
Python app, and make a small change
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Commit your changes in Git
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Push the code changes to Azure
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
Once deployment has completed, refresh the page
azappshu001.azurewebsites.net
Ref: https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-python
References
1. https://docs.microsoft.com/en-us/azure/app-service-web/app-service-web-get-started-
python
Thank You!!
Google Cloud Platform (GCP)
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
1
What’s Google Cloud Platform?
Google Cloud Platform is a set of
services that enables developers to
build, test and deploy applications on
Google’s reliable infrastructure.
2
Google Cloud Platform Services!
3
Why Google Cloud Platform?
Run on Google’s Infrastructure
Global Network
Redundancy
Innovative Infrastructure
4
Why Google Cloud Platform? (contd..)
Focus on your product
Rapidly develop, deploy and iterate your applications without worrying about
system administration. Google manages your application, database and
storage servers so you don’t have to.
Managed services
Developer Tools and SDKs
Console and Administration
5
Why Google Cloud Platform? (contd..)
Mix and Match Services
Compute
Storage
Services
6
Why Google Cloud Platform? (contd..)
Scale to millions of users
Scale-up: Cloud Platform is designed to scale like Google’s own products, even when you
experience a huge traffic spike. Managed services such as App Engine or Cloud Datastore give
you auto-scaling that enables your application to grow with your users.
Scale-down: Just as Cloud Platform allows you to scale-up, managed services also scale down.
You don’t pay for computing resources that you don’t need.
7
Why Google Cloud Platform? (contd..)
Performance you can count on
Google’s compute infrastructure gives you consistent CPU, memory and disk
performance. The network and edge cache serve responses rapidly to your
users across the world.
8
Why Google Cloud Platform? (contd..)
9
Google Cloud Platform Services
I. Cloud Platform offers both a fully managed platform and
flexible virtual machines, allowing you to choose a
system that meets your needs.
II. Use App Engine, a Platform-as-a-Service, when you
just want to focus on your code and not worry about
patching or maintenance.
III. Get access to raw virtual machines with Compute
Engine and have the flexibility to build anything you
need.
10
Google Cloud Platform Services
I. Google Cloud Platform provides a range of storage services that allow you
to maintain easy and quick access to your data.
II. With Cloud SQL and Datastore you get MySQL or NoSQL databases,
while Cloud Storage provides flexible object storage with global edge
caching.
11
Google Cloud Platform Services
II. You don’t need to build these from scratch, just take
advantage of easy integration within Cloud Platform.
12
Google Cloud Platform Services – from User end!
Your application should go wherever your users go: scale your application using Google Cloud Endpoints.
13
Example 1: Host your web-page in Google Cloud Platform
14
Example 1: Host your web-page in Google Cloud Platform
15
An easy example: Host your web-page inside Google Cloud Platform
16
An easy example: Host your web-page inside Google Cloud Platform
ii) In the list of buckets, find the bucket you created, then click the more actions icon next to it
and select Edit configuration.
17
An easy example: Host your web-page inside Google Cloud Platform
iii) In the Configure website dialog, specify the Main Page and the 404
(Not Found) Page or even your web-site folder!
Check whether all are
shared publicly!
18
An easy example: Host your web-page inside Google Cloud Platform
iv) Get the public link of your html of home-page or index.html
19
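The same bucket setup can also be done from Cloud Shell with gsutil. This is an illustrative sketch, not part of the demo; the bucket and object names are inferred from the public URL on the next slide, and flags may vary:

    gsutil mb gs://gcp-webpage                                              # create the bucket
    gsutil cp -r GCP-Webpage gs://gcp-webpage                               # upload the site folder
    gsutil acl ch -u AllUsers:R gs://gcp-webpage/GCP-Webpage/index1.html    # make the page public
    gsutil web set -m index1.html gs://gcp-webpage                          # set the main page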
And you are ready to go!
https://storage.googleapis.com/gcp-webpage/GCP-Webpage/index1.html
20
Example 2: Build your web-app using Google App Engine
21
Another example: Host your web-app using Google App Engine
i) Open the Google Cloud Platform Console & create a new project using
Cloud Platform project and App Engine application
22
Another example: Host your web-app using Google App Engine
ii) When prompted, select the region where you want your App Engine
application located.
23
Another example: Host your web-app using Google App Engine
iii) Select your preferred programming language to build your app.
24
iv) Activate your Google Cloud Shell .
25
v) Clone the Hello World sample app repository and go to the directory that
contains the sample code
26
vi) Each application must contain ‘app.yaml’ and the code base ‘main.py’ [with Flask web-app
deployment]
27
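For reference, a minimal pair along these lines. This is an illustrative sketch for the Python 2.7 standard environment of that period, not the exact files from the sample repository; in particular, how Flask is made available (a vendored library vs. a libraries entry) depends on the sample:

    # app.yaml (minimal, illustrative)
    runtime: python27
    api_version: 1
    threadsafe: true
    handlers:
    - url: /.*
      script: main.app

    # main.py (illustrative Flask app)
    from flask import Flask
    app = Flask(__name__)

    @app.route('/')
    def hello():
        return 'Hello World!'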
vii) From within the hello_world directory where the app's app.yaml configuration file is located,
start the local development server:
dev_appserver.py $PWD
28
Visit in your web browser to view the app
29
You can shut-down the development server at any point!
30
You can leave the development server running while you develop your application.
The development server watches for changes in your source files and reloads them
if necessary
Edit main.py
31
Edit main.py
32
Reload the web-page
33
Now deploy your app to App Engine: gcloud app deploy app.yaml --project gcp-pythonapp
34
Now deploy your app to App Engine: gcloud app deploy app.yaml --project gcp-pythonapp
35
View your application : gcloud app browse
36
View your application : gcloud app browse
37
You have successfully deployed a web app!
38
Some Useful Links!
39
References
• https://cloud.google.com/storage/docs/
• https://cloud.google.com/why-google/
• https://cloud.google.com/products/
• http://fethidilmi.blogspot.com
• https://www.slideshare.net/delphiexile/google-cloud-platform-overview-28927697
40
41
CLOUD COMPUTING
SLA - Tutorial
2
Problem-1
Cloud SLA: Suppose a cloud guarantees service availability for 99% of the time. A third-party
application runs in the cloud for 12 hours/day. At the end of one month, it was found that the
total outage is 10.75 hrs.
Find out whether the provider has violated the initial availability guarantee.
3
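A solution sketch, assuming a 30-day month (the problem does not fix the number of days):

    Service hours in the month   = 12 hrs/day x 30 days = 360 hrs
    Permitted outage at 99%      = 1% of 360 = 3.6 hrs
    Observed outage              = 10.75 hrs > 3.6 hrs
    Observed availability        = (360 - 10.75) / 360 ≈ 97.01% < 99%
    => The provider has violated the availability guarantee.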
Problem-2
Consider a scenario where a company X wants to use a cloud service from a provider P. The service level agreement (SLA) guarantees negotiated
between the two parties prior to initiating business are as follows:
• Availability guarantee: 99.95% time over the service period
• Service period: 30 days
• Maximum service hours per day: 12 hours
• Cost: $50 per day
Service credits are awarded to customers if availability guarantees are not satisfied. Monthly connectivity uptime service levels are given as:
However, in reality it was found that over the service period the cloud service suffered five outages of durations
5 hrs, 30 mins, 1 hr 30 mins, 15 mins, and 2 hrs 25 mins, each on a different day, due to which the normal service guarantees were violated.
If SLA negotiations are honored, compute the effective cost payable towards buying the cloud service.
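A partial worked sketch (the monthly connectivity uptime service-credit table referred to above is not reproduced here, so only the achieved uptime is computed; the final cost then follows from the applicable credit band):

    Total outage     = 5:00 + 0:30 + 1:30 + 0:15 + 2:25 = 9 hrs 40 mins ≈ 9.67 hrs
    Service hours    = 12 hrs/day x 30 days = 360 hrs
    Achieved uptime  = (360 - 9.67) / 360 ≈ 97.31%  (below the guaranteed 99.95%)
    Effective cost   = 30 x $50 minus the service credit for the 97.31% uptime band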
4
5
Cloud Computing : Economics
Tutorial
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
1
Cloud Properties: Economic Viewpoint
• Common Infrastructure
– pooled, standardized resources, with benefits generated by statistical
multiplexing.
• Location-independence
– ubiquitous availability meeting performance requirements, with benefits
deriving from latency reduction and user experience enhancement.
• Online connectivity
– an enabler of other attributes ensuring service access. Costs and
performance impacts of network architectures can be quantified using
traditional methods.
9/11/2017 2
Cloud Properties: Economic Viewpoint (contd…)
• Utility pricing
– usage-sensitive or pay-per-use pricing, with benefits applying in
environments with variable demand levels.
• on-Demand Resources
– scalable, elastic resources provisioned and de-provisioned without
delay or costs associated with change.
9/11/2017 3
Utility Pricing in Detail
• D(t): demand for resources, 0 < t < T
• P = max D(t): peak demand
• A = avg D(t): average demand
• B: baseline (owned) unit cost [BT: total baseline cost]
• C: cloud unit cost [CT: total cloud cost]
• U = C/B: utility premium [for the rental car example, U = 4.5]
• Total cloud cost: CT = ∫₀ᵀ U·B·D(t) dt = A·U·B·T
• Total baseline cost: BT = P·B·T (the baseline must be provisioned to handle peak demand)
• When is cloud cheaper than owning?
– CT < BT ⟺ A·U·B·T < P·B·T ⟺ U < P/A
– i.e., when the utility premium is less than the ratio of peak demand to average demand
9/11/2017 4
Utility Pricing in Real World
• In practice demands are often highly spiky
– News stories, marketing promotions, product launches, Internet
flash floods, Tax season, Christmas shopping, etc.
• Often a hybrid model is the best
– You own a car for daily commute, and rent a car when traveling or
when you need a van to move
– Key factor is again the ratio of peak to average demand
– But we should also consider other costs
• Network cost (both fixed costs and usage costs)
• Interoperability overhead
• Consider Reliability, accessibility
9/11/2017 5
Value of on-Demand Services
• Simple Problem: When owning your resources, you will pay a penalty
whenever your resources do not match the instantaneous demand
I. Either pay for unused resources, or suffer the penalty of missing service delivery
D(t) – instantaneous demand at time t
R(t) – resources provisioned at time t
Penalty cost ∝ ∫ |D(t) – R(t)| dt
If demand is flat, penalty = 0
If demand grows linearly, periodic provisioning is acceptable
9/11/2017 6
Penalty Costs for Exponential Demand
• Penalty cost ∝ ∫ |D(t) − R(t)| dt
• If demand is exponential (D(t) = e^t), any fixed provisioning interval t_p, provisioned according
to the then-current demand, falls exponentially behind
• R(t) = e^(t − t_p)
• D(t) − R(t) = e^t − e^(t − t_p) = e^t·(1 − e^(−t_p)) = k₁·e^t
• Penalty cost ∝ c·k₁·e^t
9/11/2017 7
Assignment 1
Consider the peak computing demand for an organization is 120 units. The demand as a function of
time can be expressed as:
D(t) = 50 sin(t),   0 ≤ t < π/2
       20 sin(t),   π/2 ≤ t < π
The resource provisioned by the cloud to satisfy current demand at time t is given as:
R(t) = D(t) + δ · dD(t)/dt
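Assuming the task is to evaluate the penalty ∝ ∫ |D(t) − R(t)| dt = δ · ∫ |dD/dt| dt over [0, π), a minimal numeric sketch (δ kept symbolic, each smooth piece integrated separately):

```python
import numpy as np

def piece_integral(amplitude, t0, t1, n=100_000):
    t = np.linspace(t0, t1, n)
    dD = amplitude * np.cos(t)             # d/dt of amplitude*sin(t)
    return np.trapz(np.abs(dD), t)

coeff = piece_integral(50, 0, np.pi / 2) + piece_integral(20, np.pi / 2, np.pi)
print(f"Penalty ≈ {coeff:.1f} · δ")        # 50 + 20 = 70 · δ
```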
8
Assignment 2
Consider that the peak computing demand for an organization is 100 units.
The demand as a function of time can be expressed as
D(t) = 50(1 + e^(−t))
Baseline (owned) unit cost is 120 and cloud unit cost is 200.
In this situation is cloud cheaper than owning for a period of 100 time units?
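A quick numeric check using the break-even rule U < P/A from the utility-pricing slide:

```python
import numpy as np

B, C, T = 120.0, 200.0, 100.0
t = np.linspace(0, T, 200_001)
D = 50 * (1 + np.exp(-t))

P, A = D.max(), np.trapz(D, t) / T         # peak = 100, average ≈ 50.5
U = C / B                                  # utility premium ≈ 1.67
print(f"U = {U:.2f}, P/A = {P / A:.2f}, cloud cheaper: {U < P / A}")   # True
```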
Assignment 3
A company X needs to support a spike in demand when it becomes popular, followed potentially by a
reduction once some of the visitors turn away. The company has two options to satisfy the
requirements which are given in the following table:
Expenditures In-house server (INR) Cloud server
Purchase cost 6,00,000 -
Number of CPU cores 12 8
Cost/hour (over three year span) - 42
1
Introduction
• MapReduce: programming model developed at Google
• Objective:
– Implement large scale search
– Text processing on massively scalable web data stored using BigTable and GFS distributed file
system
• Designed for processing and generating large volumes of data via massively parallel
computations, utilizing tens of thousands of processors at a time
• Fault tolerant: ensure progress of computation even if processors and networks fail
• Example:
– Hadoop: open source implementation of MapReduce (developed at Yahoo!)
– Available on pre-packaged AMIs on Amazon EC2 cloud platform
9/11/2017 2
MapReduce Model
• Parallel programming abstraction
• Used by many different parallel applications which carry out large-scale
computation involving thousands of processors
• Leverages a common underlying fault-tolerant implementation
• Two phases of MapReduce:
– Map operation
– Reduce operation
• A configurable number of M ‘mapper’ processors and R ‘reducer’ processors are
assigned to work on the problem
• The computation is coordinated by a single master process
9/11/2017 3
MapReduce Model Contd…
• Map phase:
– Each mapper reads approximately 1/M of the input from the global file
system, using locations given by the master
– Map operation consists of transforming one set of key-value pairs to
another:
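In the standard MapReduce formulation this is: map(k1, v1) → list(k2, v2)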
9/11/2017 4
MapReduce Model Contd…
• Reduce phase:
– The master informs the reducers where the partial computations have been stored
on local files of respective mappers
– Reducers make remote procedure call requests to the mappers to fetch the files
– Each reducer groups the results of the map step by key and performs a function f on the list of values that correspond to each key:
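In the standard formulation this is: reduce(k2, list(v2)) → (k2, f(list(v2)))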
9/11/2017 5
MapReduce: Example
• 3 mappers; 2 reducers
• Map function:
• Reduce function:
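The slide's map and reduce functions are shown as figures; the canonical word-count example below is an illustrative stand-in, written in plain Python and simulating the map, shuffle and reduce steps in-process:

```python
from collections import defaultdict

def map_fn(_, line):                      # (key, value) -> list of (word, 1) pairs
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):              # (key, list of values) -> (key, total)
    return (word, sum(counts))

lines = ["the quick brown fox", "the lazy dog", "the fox"]
grouped = defaultdict(list)
for i, line in enumerate(lines):          # map phase: one call per input split
    for k, v in map_fn(i, line):
        grouped[k].append(v)              # shuffle: group intermediate values by key
print([reduce_fn(k, v) for k, v in grouped.items()])   # e.g. ('the', 3), ('fox', 2), ...
```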
9/11/2017 6
Problem-1
9/11/2017 7
Problem-2
Write the pseudo-codes (for map and reduce functions) for calculating
the average of a set of integers in MapReduce.
Suppose A = (10, 20, 30, 40, 50) is a set of integers. Show the map and
reduce outputs.
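One possible solution sketch (plain Python standing in for pseudo-code): each mapper emits a partial (sum, count) pair under a single key, and the reducer combines them:

```python
def map_fn(_, number):
    return [("avg", (number, 1))]          # emit a (partial sum, partial count) pair

def reduce_fn(key, pairs):
    total = sum(s for s, _ in pairs)
    count = sum(c for _, c in pairs)
    return (key, total / count)

A = [10, 20, 30, 40, 50]
grouped = {}
for i, n in enumerate(A):
    for k, v in map_fn(i, n):
        grouped.setdefault(k, []).append(v)
print([reduce_fn(k, v) for k, v in grouped.items()])   # [('avg', 30.0)]
```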
9/11/2017 8
Problem-3
Compute total and average salary of organization XYZ and group by
based on gender (male or female) using MapReduce. The input is as
follows
Name, Gender, Salary
John, M, 10,000
Martha, F, 15,000
----
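One possible solution sketch: the mapper keys each record by gender, and the reducer emits the total and average salary per gender. The records beyond the two shown above are made up for illustration:

```python
def map_fn(_, record):
    name, gender, salary = record
    return [(gender, salary)]              # key every record by gender

def reduce_fn(gender, salaries):
    total = sum(salaries)
    return (gender, total, total / len(salaries))      # (gender, total, average)

records = [("John", "M", 10_000), ("Martha", "F", 15_000),
           ("Alex", "M", 12_000), ("Rita", "F", 9_000)]   # sample rows (illustrative)
grouped = {}
for i, rec in enumerate(records):
    for k, v in map_fn(i, rec):
        grouped.setdefault(k, []).append(v)
print([reduce_fn(k, v) for k, v in grouped.items()])
```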
9/11/2017 9
Problem-4
Write the Map and Reduce functions (pseudo-codes) for the following Word
Length Categorization problem under MapReduce model.
Categories:
tiny: 1-2 letters; small: 3-5 letters; medium: 6-9 letters; big: 10 or more letters
9/11/2017 10
11
CLOUD COMPUTING
Resource Management - I
Source: http://www.cse.hcmut.edu.vn/~ptvu/gc/2012/GC-pp.pdf
2
Resource types
• Physical resources
Computer, disk, database, network, scientific instruments.
• Logical resources
Execution, monitoring, communication, application.
Source: http://www.cse.hcmut.edu.vn/~ptvu/gc/2012/GC-pp.pdf
3
Resources Management
Source: http://www.cse.hcmut.edu.vn/~ptvu/gc/2012/GC-pp.pdf
4
Data Center Power Consumption
Ref: Efficient Resource Management for Cloud Computing Environments, by Andrew J. Younge,
Gregor von Laszewski, Lizhe Wang, Sonia Lopez-Alarcon, Warren Carithers,
5
Motivation for Green Data Centers
Economic
• New data centers run on the megawatt scale, requiring millions of dollars to operate.
• Recently, institutions are looking for new ways to reduce costs.
• Many facilities are at their peak operating stage, and cannot expand without a new power source.
Environmental
• Majority of energy sources are fossil fuels.
• Huge volume of CO2 emitted each year from power plants.
• Sustainable energy sources are not ready.
• Need to reduce energy dependence.
6
Green Computing ?
• Advanced scheduling schemas to reduce energy consumption.
• Power aware
• Thermal aware
7
Research Directions
How to conserve energy within a Cloud environment.
• Schedule VMs to conserve energy.
• Management of both VMs and underlying infrastructure.
• Minimize operating inefficiencies for non-essential tasks.
• Optimize data center design.
8
Steps towards Energy Efficiency
[Figure: Green Cloud Framework – Virtual Machine Controls (scheduling: power aware, thermal aware; management: VM migration, dynamic shutdown; VM image design) and Data Center Design (server & rack design; air conditioning & recirculation design)]
9
VM scheduling on Multi-core Systems
• There is a relationship between the number of processes used and power consumption
• We can schedule VMs to take advantage of this relationship in order to conserve power
[Figure: measured power (Watts, roughly 90–180 W) versus number of processing cores in use (0–8)]
10
Power-aware Scheduling
Scheduling
11
485 Watts vs. 552 Watts!
[Figure: eight VMs consolidated across Node 1 @ 170 W and Node 2 @ 105 W]
12
VM Management
• Monitor Cloud usage and load.
• When load decreases:
• Live migrate VMs to more utilized nodes.
• Shutdown unused nodes.
13
[Figure: four-step example – as load decreases, VMs are live-migrated from Node 2 onto Node 1, until Node 2 is empty and can be taken offline]
14
Minimizing VM Instances
• Virtual machines are loaded!
• Lots of unwanted packages.
• Unneeded services.
• Are multi-application oriented, not service oriented.
• Clouds are based on a Service Oriented Architecture.
• Need a custom lightweight Linux VM for service oriented science.
• Need to keep VM image as small as possible to reduce network latency.
15
Typical Cloud Linux Image
• Start with Ubuntu 9.04.
• Remove all packages not required for the base image.
• No X11
• No Window Manager
• Minimalistic server install
• Can load language support on demand (via package
manager)
• Readahead profiling utility.
• Reorder boot sequence
• Pre-fetch boot files on disk
• Minimize CPU idle time due to I/O delay
• Optimize Linux kernel.
• Built for Xen DomU
• No 3d graphics, no sound, minimalistic kernel
• Build modules within kernel directly
16
Energy Savings
• Reduced boot times from 38 seconds to just 8 seconds.
• 30 seconds @ 250Watts is 2.08wh or .002kwh.
• In a small Cloud where 100 images are created every hour.
• Saves .2kwh of operation @ 15.2c per kwh.
• At 15.2c per kwh this saves $262.65 every year.
• In a production Cloud where 1000 images are created every minute.
• Saves about 120 kWh every hour.
• At 15.2c per kwh this saves over 1 million dollars every year.
• Image size from 4GB to 635MB.
• Reduces time to perform live-migration.
• Can do better.
17
Summary - 1
• Cloud computing is an emerging topic in Distributed Systems.
• Need to conserve energy wherever possible!
• Green Cloud Framework:
• Power-aware scheduling of VMs.
• Advanced VM & infrastructure management.
• Specialized VM Image.
• Small energy savings result in a large impact.
• Combining a number of different methods together can have a larger impact than when implemented separately.
18
Summary - 2
19
Thank you!
CLOUD COMPUTING
Resource Management - II
Source: http://www.cse.hcmut.edu.vn/~ptvu/gc/2012/GC-pp.pdf
2
Resource types
• Physical resources
Computer, disk, database, network, scientific instruments.
• Logical resources
Execution, monitoring, communication, application.
Source: http://www.cse.hcmut.edu.vn/~ptvu/gc/2012/GC-pp.pdf
3
Resources Management
Source: http://www.cse.hcmut.edu.vn/~ptvu/gc/2012/GC-pp.pdf
4
Resource Management for IaaS
Source:
http://www.zearon.com/down/Resource%20management%20for%20Infrastructure%20as%20a%20Service%20%28IaaS%29%20in%20cloud%20computing%20A%20survey.pdf
5
Resource Management - Objectives
• Scalability
• Quality of service
• Optimal utility
• Reduced overheads
• Improved throughput
• Reduced latency
• Specialized environment
• Cost effectiveness
• Simplified interface
6
Resource Management – Challenges (Hardware)
7
Resource Management – Challenges (Logical resources)
• Operating system
• Energy
• Network throughput/bandwidth
• Load balancing mechanisms
• Information security
• Delays
• APIs (Application Programming Interfaces)
• Protocols
8
Resource Management Aspects
• Resource provisioning
• Resource allocation
• Resource requirement mapping
• Resource adaptation
• Resource discovery
• Resource brokering
• Resource estimation
• Resource modeling
9
Resource Management
• Resource provisioning: Allocation of a service provider's resources to a customer
• Resource allocation: Distribution of resources economically among competing groups of people or programs
• Resource adaptation: Ability or capacity of the system to adjust the resources dynamically to fulfill the requirements of the user
• Resource mapping: Correspondence between resources required by the users and resources available with the provider
• Resource modeling: Based on detailed information of transmission network elements, resources and entities participating in the network. Attributes of resource management: states, transitions, inputs and outputs within a given environment. Resource modeling helps to predict the resource requirements in subsequent time intervals
• Resource estimation: A close guess of the actual resources required for an application, usually with some thought or calculation involved
• Resource discovery and selection: Identification of the list of authenticated resources that are available for job submission, and choosing the best among them
• Resource brokering: Negotiation of the resources through an agent to ensure that the necessary resources are available at the right time to complete the objectives
• Resource scheduling: A resource schedule is a timetable of events and resources. Shared resources are available at certain times and events are planned during these times. In other words, it is determining when an activity should start or end, depending on its (1) duration, (2) predecessor activities, (3) predecessor relationships, and (4) resources allocated
10
Resource Provisioning Approaches
• Nash equilibrium approach using Game theory: Run-time management and allocation of IaaS resources considering several criteria such as the heterogeneous distribution of resources, rational exchange behaviors of cloud users, incomplete common information and dynamic successive allocation
• Network queuing model: Presents a model based on a network of queues, where the queues represent different tiers of the application. The model sufficiently captures the behavior of tiers with significantly different performance characteristics and application idiosyncrasies, such as session-based workloads, concurrency limits, and caching at intermediate tiers
• Prototype provisioning: Employs the k-means clustering algorithm to automatically determine the workload mix and a queuing model to predict the server capacity for a given workload mix
• Resource (VM) provisioning: Uses virtual machines (VMs) that run on top of the Xen hypervisor. The system provides a Simple Earliest Deadline First (SEDF) scheduler that implements weighted fair sharing of the CPU capacity among all the VMs. The share of CPU cycles for a particular VM can be changed at runtime
• Adaptive resource provisioning: Automatic bottleneck detection and resolution under dynamic resource management, which has the potential to enable cloud infrastructure providers to provide SLAs for web applications that guarantee specific response time requirements while minimizing resource utilization
• SLA oriented methods: Handling the process of dynamic provisioning to meet user SLAs in an autonomic manner. Additional resources are provisioned for applications when required and are removed when they are not necessary
• Dynamic and automated framework: A dynamic and automated framework which can adapt the adaptive parameters to meet the specific accuracy goal, and then dynamically converge to near-optimal resource allocation to handle unexpected changes
• Optimal cloud resource provisioning (OCRP): The demand and price uncertainty is considered using optimal cloud resource provisioning (OCRP), including deterministic equivalent formulation, sample-average approximation, etc.
11
Resource Allocation Approaches
• Market-oriented resource allocation: Considers the case of a single cloud provider and addresses the question of how to best match customer demand in terms of both supply and price in order to maximize the provider's revenue and customer satisfaction while minimizing energy cost. In particular, it models the problem as a constrained discrete-time optimal control problem and uses Model Predictive Control (MPC) to find its solution
• Intelligent multi-agent model: An intelligent multi-agent model based on virtualization rules for resource virtualization to automatically allocate service resources suitable for mobile devices. It infers user demand by analyzing and learning user context information
• Energy-aware resource allocation: Resource allocation is carried out by mimicking the behavior of ants, where ants are likely to choose the path identified as the shortest, indicated by a relatively higher density of pheromone left on the path compared to other possible paths
• Measurement based analysis on performance: Focuses on measurement based analysis of the performance impact of co-locating applications in a virtualized cloud in terms of throughput and resource sharing effectiveness, including the impact of idle instances on applications that are running concurrently on the same physical host
• Dynamic resource allocation method: Dynamic resource allocation based on the load of VMs on IaaS, which enables users to dynamically add and/or delete one or more instances on the basis of the load and the conditions specified by the user
• Real time resource allocation mechanism: Designed for helping small and medium sized IaaS cloud providers to better utilize their hardware resources with minimum operational cost by a well-designed underlying hardware infrastructure, an efficient resource scheduling algorithm and a set of migrating operations of VMs
• Dynamic scheduling and consolidation mechanism: Presents the architecture and algorithmic blueprints of a framework for workload co-location, which provides customers with the ability to formally express workload scheduling flexibilities using Directed Acyclic Graphs (DAGs), and optimizes the use of cloud resources to co-locate clients' workloads
12
Resource Mapping Approaches
• Symmetric mapping pattern: Symmetric mapping pattern for the design of resource supply systems. It divides resource supply into three functions: (1) users and providers match and engage in resource supply agreements, (2) users place tasks on subscribed resource containers, and (3) providers place supplied resource containers on physical resources
• Load-aware mapping: Explores how to simplify VM image management and reduce image preparation overhead by multicast file transferring and image caching/reusing. Load-aware mapping further reduces deployment overhead and makes efficient use of resources
• Minimum congestion mapping: Framework for solving a natural graph mapping problem arising in cloud computing. Applying this framework to obtain offline and online approximation algorithms for workloads given by depth-d trees and complete graphs
• Iterated local search based request partitioning: Request partitioning approach based on iterated local search that facilitates the cost-efficient and on-line splitting of user requests among eligible Cloud Service Providers (CSPs) within a networked cloud environment
• SOA API: Designed to accept different resource usage prediction models and map QoS constraints to resources from various IaaS providers
• Impatient task mapping: Batch mapping via genetic algorithms with throughput as a fitness function that can be used to map jobs to cloud resources
• Distributed ensembles of virtual appliances (DEVAs): Requirements are inferred by observing the behavior of the system under different conditions and creating a model that can later be used to obtain approximate parameters to provide the resources
• Mapping a virtual network onto a substrate network: An effective method (using backbone mapping) for computing high quality mappings of virtual networks onto substrate networks. The computed virtual networks are constructed to have sufficient capacity to accommodate any traffic pattern allowed by user-specified traffic constraints
13
Resource Adaptation Approaches
• Reinforcement learning guided control policy: A multi-input multi-output feedback control model-based dynamic resource provisioning algorithm which adopts reinforcement learning to adjust adaptive parameters to guarantee the optimal application benefit within the time constraint
• Web-service based prototype: A web-service based prototype framework, used for performance evaluation of various resource adaptation algorithms under different realistic settings
• OnTimeMeasure service: Presents an application-adaptation case study that uses OnTimeMeasure-enabled performance intelligence in the context of dynamic resource allocation within thin-client based virtual desktop clouds to increase cloud scalability, while simultaneously delivering satisfactory user quality-of-experience
• Virtual networks: Proposes a virtual networks architecture as a mechanism in cloud computing that can aggregate traffic isolation, improving security and facilitating pricing, also allowing customers to act in cases where the performance is not in accordance with the contract for services
• DNS-based load balancing: Proposes a system that contains the appropriate elements so that applications can be scaled by replicating VMs (or application containers), by reconfiguring them on the fly, and by adding load balancers in front of these replicas that can scale by themselves
• Hybrid approach: Proposes a mechanism for providing dynamic management in virtualized consolidated server environments that host multiple multi-tier applications, using layered queuing models for Xen-based virtual machine environments and a novel optimization technique that combines bin packing and gradient search
14
Performance Metrics for Resource Management
• Reliability
• Ease of deployment
• QoS
• Delay
• Control overhead
15
Thank you!
CLOUD COMPUTING
CLOUD SECURITY I
2
Security Attacks
Any action that compromises the security of
information.
Four types of attack:
1. Interruption
2. Interception
3. Modification
4. Fabrication
Basic model: information flows from a Source (S) to a Destination (D)
3
Security Attacks (contd.)
Interruption: attack on availability (the flow from source S to destination D is blocked)
Interception: attack on confidentiality (an intruder I eavesdrops on the S→D flow)
4
Security Attacks (contd.)
Modification: attack on integrity (an intruder I alters the S→D flow)
Fabrication: attack on authenticity (an intruder I injects messages claiming to come from S)
5
Classes of Threats
Disclosure
Snooping
Deception
Modification, spoofing, repudiation of origin, denial of receipt
Disruption
Modification
Usurpation
Modification, spoofing, delay, denial of service
6
Policies and Mechanisms
7
Goals of Security
Prevention
Prevent attackers from violating security policy
Detection
Detect attackers’ violation of security policy
Recovery
Stop attack, assess and repair damage
Continue to function correctly even if attack succeeds
8
Trust and Assumptions
Underlie all aspects of security
Policies
Unambiguously partition system states
Correctly capture security requirements
Mechanisms
Assumed to enforce policy
Support mechanisms work correctly
9
Types of Mechanisms
10
Assurance
Specification
Requirements analysis
Statement of desired functionality
Design
How system will meet specification
Implementation
Programs/systems that carry out design
11
Operational Issues
Cost-Benefit Analysis
Is it cheaper to prevent or recover?
Risk Analysis
Should we protect something?
How much should we protect this thing?
Laws and Customs
Are desired security measures illegal?
Will people do them?
12
Human Issues
Organizational Problems
Power and responsibility
Financial benefits
People problems
Outsiders and insiders
Social engineering
13
Tying Together
Threats
Policy
Specification
Design
Implementation
Operation
14
Passive and Active Attacks
Passive attacks
Obtain information that is being transmitted
(eavesdropping).
Two types:
Release of message contents:- It may be desirable to
prevent the opponent from learning the contents of the
transmission.
Traffic analysis:- The opponent can determine the
location and identity of communicating hosts, and
observe the frequency and length of messages being
exchanged.
Very difficult to detect.
15
Active attacks
Involve some modification of the data stream or the
creation of a false stream.
Four categories:
Masquerade:- One entity pretends to be a different entity.
Replay:- Passive capture of a data unit and its subsequent
retransmission to produce an unauthorized effect.
Modification:- Some portion of a legitimate message is
altered.
Denial of service:- Prevents the normal use of
communication facilities.
16
Security Services
Confidentiality (privacy)
Authentication (who created or sent the data)
Integrity (has not been altered)
Non-repudiation (the order is final)
Access control (prevent misuse of resources)
Availability (permanence, non-erasure)
Denial of Service Attacks
Virus that deletes files
17
Role of Security
A security infrastructure provides:
Confidentiality – protection against loss of privacy
Integrity – protection against data alteration/ corruption
Availability – protection against denial of service
Authentication – identification of legitimate users
Authorization – determination of whether or not an
operation is allowed by a certain user
Non-repudiation – ability to trace what happened, &
prevent denial of actions
Safety – protection against tampering, damage & theft
18
Types of Attack
Social engineering/phishing
Physical break-ins, theft, and curb shopping
Password attacks
Buffer overflows
Command injection
Denial of service
Exploitation of faulty application logic
Snooping
Packet manipulation or fabrication
Backdoors
19
Network Security…
Network security works like this:
Determine network security policy
Implement network security policy
Reconnaissance
Vulnerability scanning
Penetration testing
Post-attack investigation
20
Step 1: Determine Security Policy
A security policy is a full security roadmap
Usage policy for networks, servers, etc.
User training about password sharing, password strength,
social engineering, privacy, etc.
Privacy policy for all maintained data
A schedule for updates, audits, etc.
The network design should reflect this policy
The placement/protection of database/file servers
The location of demilitarized zones (DMZs)
The placement and rules of firewalls
The deployment of intrusion detection systems (IDSs)
21
Step 2: Implement Security Policy
Implementing a security policy includes:
Installing and configuring firewalls
iptables is a common free firewall configuration for Linux
22
Step 2: Implement Security Policy
23
Step 2: Implement Security Policy
Firewall
Applies filtering rules to packets passing through it
Comes in three major types:
Packet filter – Filters by destination IP, port or protocol
Stateful – Records information about ongoing TCP sessions, and ensures out-of-
session packets are discarded
Application proxy – Acts as a proxy for a specific application, and scans all layers
for malicious data
Intrusion Detection System (IDS)
Scans the incoming messages, and creates alerts when suspected scans/attacks are
in progress
Honeypot/honeynet (e.g. honeyd)
Simulates a decoy host (or network) with services
24
Step 3: Reconnaissance
First, we learn about the network
IP addresses of hosts on the network
Identify key servers with critical data
Services running on those hosts/servers
Vulnerabilities on those services
Two forms: passive and active
Passive reconnaissance is undetectable
Active reconnaissance is often detectable by IDS
25
Step 4: Vulnerability Scanning
We now have a list of hosts and services
We can now target these services for attacks
Many scanners will detect vulnerabilities (e.g. nessus)
These scanners produce a risk report
Other scanners will allow you to exploit them (e.g. metasploit)
These scanners find ways in, and allow you to choose the payload to
use (e.g. obtain a root shell, download a package)
The payload is the code that runs once inside
The best scanners are updateable
For new vulnerabilities, install/write new plug-ins
e.g. Nessus Attack Scripting Language (NASL)
26
Step 5: Penetration Testing
We have identified vulnerabilities
Now, we can exploit them to gain access
Using frameworks (e.g. metasploit), this is as simple as
selecting a payload to execute
Otherwise, we manufacture an exploit
We may also have to try to find new vulnerabilities
This involves writing code or testing functions accepting
user input
27
Step 6: Post-Attack Investigation
Forensics of Attacks
This process is heavily guided by laws
Also, this is normally done by a third party
Retain chain of evidence
The evidence in this case is the data on the host
The log files of the compromised host hold the footsteps and
fingerprints of the attacker
Every minute with that host must be accounted for
For legal reasons, you should examine a low-level copy of the disk
and not modify the original
28
Thank You!
29
CLOUD COMPUTING
CLOUD SECURITY II
• Use as much or as little as you need, use only when you want, and pay only for what you use
2
Economic Advantages of Cloud Computing
• For consumers:
– No upfront commitment in buying/leasing hardware
– Can scale usage according to demand
– Minimizing start-up costs
• Small scale companies and startups can reduce CAPEX (Capital
Expenditure)
• For providers:
– Increased utilization of datacenter resources
3
Why aren’t Everyone using Cloud?
4
Concern…
5
Survey on Potential Cloud Barriers
6
Why does Cloud Computing bring New Threats?
• Traditional system security mostly means keeping attackers out
• The attacker needs to either compromise the authentication/access control system,
or impersonate existing users
• But cloud allows co-tenancy: Multiple independent users share the same physical
infrastructure
– An attacker can legitimately be in the same physical machine as the target
7
Security Stack
• IaaS: entire infrastructure from facilities to hardware
[Figure: split of security responsibility between the customer and the IaaS provider]
– Customer-side system administrator manages the same, with the provider handling platform and infrastructure security
8
Sample Clouds
Source: “Security Guidance for Critical Areas of Focus in Cloud Computing” v2.1, p.18
9
Gartner’s Seven Cloud Computing Security Risks
• Gartner:
– http://www.gartner.com/technology/about.jsp
– Cloud computing has “unique attributes that require risk assessment in areas such as data
integrity, recovery and privacy, and an evaluation of legal issues in areas such as e-
discovery, regulatory compliance and auditing,” Gartner says
• Security Risks
– Privileged User Access
– Regulatory Compliance & Audit
– Data Location
– Data Segregation
– Recovery
– Investigative Support
– Long-term Viability
10
Privileged User Access
• Sensitive data processed outside the enterprise brings with it an inherent
level of risk
• Outsourced services bypass the “physical, logical and personnel controls”
of traditional in-house deployments.
• Get as much information as you can about the people who manage your
data
• “Ask providers to supply specific information on the hiring and oversight
of privileged administrators, and the controls over their access,” Gartner
says.
11
Regulatory Compliance & Audit
• Traditional service providers are subjected to external audits and security
certifications.
• Cloud computing providers who refuse to undergo this scrutiny are “signaling that
customers can only use them for the most trivial functions,” according to Gartner.
• Shared infrastructure – isolation of user-specific log
• No customer-side auditing facility
• Difficult to audit data held outside organization in a cloud
– Forensics also made difficult since now clients don’t maintain data locally
12
Data Location
• Hosting of data, jurisdiction?
• Data centers: located at geographically dispersed locations
• Different jurisdiction & regulations
– Laws for cross border data flows
• Legal implications
– Who is responsible for complying with regulations (e.g., SOX, HIPAA, etc.)?
– If cloud provider subcontracts to third party clouds, will the data still be secure?
13
Data Segregation
• Data in the cloud is typically in a shared environment alongside data from other
customers.
• Encryption is effective but isn’t a cure-all. “Find out what is done to segregate data
at rest,” Gartner advises.
• Encrypt data in transit, needs to be decrypted at the time of processing
– Possibility of interception
• Secure key store
– Protect encryption keys
– Limit access to key stores
– Key backup & recoverability
• The cloud provider should provide evidence that encryption schemes were
designed and tested by experienced specialists.
• “Encryption accidents can make data totally unusable, and even normal encryption
can complicate availability,” Gartner says.
14
Recovery
• Even if you don’t know where your data is, a cloud provider should tell you what will happen
to your data and service in case of a disaster.
• “Any offering that does not replicate the data and application infrastructure across multiple
sites is vulnerable to a total failure,” Gartner says. Ask your provider if it has “the ability to do
a complete restoration, and how long it will take.”
• Recovery Point Objective (RPO): The maximum amount of data that will be lost following an
interruption or disaster.
• Recovery Time Objective (RTO): The period of time allowed for recovery i.e., the time that is
allowed to elapse between the disaster and the activation of the secondary site.
• Backup frequency
• Fault tolerance
– Replication: mirroring/sharing data over disks which are located in separate physical locations to
maintain consistency
– Redundancy: duplication of critical components of a system with the intention of increasing reliability
of the system, usually in the case of a backup or fail-safe.
15
Investigative Support
• Investigating inappropriate or illegal activity may be impossible in cloud
computing
• Monitoring
– To eliminate the conflict of interest between the provider and the consumer, a neutral
third-party organization is the best solution to monitor performance.
16
Long-term Viability
• “Ask potential providers how you would get your data back and if it would
be in a format that you could import into a replacement application,”
Gartner says.
• When to switch cloud providers ?
– Contract price increase
– Provider bankruptcy
– Provider service shutdown
– Decrease in service quality
– Business dispute
• Problem: vendor lock-in
17
Other Cloud Security Issues…
• Virtualization
• Access Control & Identity Management
• Application Security
• Data Life Cycle Management
18
Virtualization
• Components:
– Virtual machine (VM)
– Virtual machine manager (VMM) or hypervisor
• Two types:
– Full virtualization: VMs run on hypervisor that interacts with the hardware
– Para virtualization: VMs interact with the host OS.
• Major functionality: resource isolation
• Hypervisor vulnerabilities:
– Shared clipboard technology– transferring malicious programs from VMs to
host
19
Virtualization (contd…)
• Hypervisor vulnerabilities:
– Keystroke logging: Some VM technologies enable the logging of keystrokes and screen
updates to be passed across virtual terminals in the virtual machine, writing to host files
and permitting the monitoring of encrypted terminal connections inside the VM.
– Virtual machine backdoors: covert communication channel
– ARP Poisoning: redirect packets going to or from the other VM.
• Hypervisor Risks
– Rogue hypervisor rootkits
• Initiate a ‘rogue’ hypervisor
• Hide itself from normal malware detection systems
• Create a covert channel to dump unauthorized code
20
Virtualization (contd…)
• Hypervisor Risks
– External modification to the hypervisor
• Poorly protected or designed hypervisor: source of attack
• May be subjected to direct modification by the external intruder
– VM escape
• Improper configuration of VM
• Allows malicious code to completely bypass the virtual environment, and obtain full root or
kernel access to the physical host
• Some vulnerable virtual machine applications: Vmchat, VMftp, Vmcat etc.
– Denial-of-service risk
• Threats:
– Unauthorized access to virtual resources – loss of confidentiality, integrity,
availability
21
Access Control & Identity Management
• Access control: similar to traditional in-house IT network
• Proper access control: to address CIA tenets of information
security
• Prevention of identity theft – major challenge
– Privacy issues raised via massive data mining
• Cloud now stores data from a lot of clients, and can run data mining algorithms to
get large amounts of information on clients
• Identity Management (IDM) – authenticate users and services
based on credentials and characteristics
22
Application Security
• Cloud applications – Web service based
• Similar attacks:
– Injection attacks: introduce malicious code to change the course of execution
– XML Signature Element Wrapping: By this attack, the original body of an XML message is moved
to a newly inserted wrapping element inside the SOAP header, and a new body is created.
– Cross-Site Scripting (XSS): XSS enables attackers to inject client-side script into Web pages viewed
by other users to bypass access controls.
– Flooding: Attacker sends a huge number of requests to a certain service, causing denial of
service.
– DNS poisoning and phishing: browser-based security issues
– Metadata (WSDL) spoofing attacks: Such attack involves malicious reengineering of Web Services’
metadata description
• Insecure communication channel
23
Data Life Cycle Management
• Data security
– Confidentiality:
• Will the sensitive data stored on a cloud remain confidential?
• Will cloud compromise leak confidential client data (i.e., fear of loss of
control over data)
• Will the cloud provider itself be honest and won’t peek into the data?
– Integrity:
• How do I know that the cloud provider is doing the computations
correctly?
• How do I ensure that the cloud provider really stored my data without
tampering with it?
24
Data Life Cycle Management (contd.)
• Availability
• Will critical systems go down at the client, if the provider is attacked in
a Denial of Service attack?
• What happens if cloud provider goes out of business?
• Data Location
• All copies, backups stored only at location allowed by contract, SLA
and/or regulation
• Archive
• Access latency
25
Thank You!
26
26
CLOUD COMPUTING
CLOUD SECURITY III
2
New Risks in Cloud
• Trust and dependence
– Establishing new trust relationship between customer and cloud provider
– Customers must trust their cloud providers to respect the privacy of their data
and integrity of their computations
• Security (multi-tenancy)
– Threats from other customers due to the subtleties of how physical resources
can be transparently shared between virtual machines (VMs)
3
Multi-tenancy
• Multiplexing VMs of disjoint customers upon the same physical hardware
– Your machine is placed on the same server with other customers
– Problem: you don’t have the control to prevent your instance from being co-resident with an adversary
• New risks
– Side-channels exploitation
• Cross-VM information leakage due to sharing of physical resource (e.g., CPU’s data caches)
• Has the potential to extract RSA & AES secret keys
– Vulnerable VM isolation mechanisms
• Via a vulnerability that allows an “escape” to the hypervisor
– Lack of control over whom you are sharing server space with
4
Attack Model
• Motivation
– To study practicality of mounting cross-VM attacks in existing third-party compute clouds
• Experiments have been carried out on real IaaS cloud service provider (Amazon
EC2)
• Two steps of attack:
– Placement: adversary arranging to place its malicious VM on the same physical machine as that of
the target customer
– Extraction: extract confidential information via side channel attack
5
Threat Model
• Assumptions of the threat model:
– Provider and infrastructure to be trusted
– Do not consider attacks that rely on subverting administrator functions
– Do not exploit vulnerabilities of the virtual machine monitor and/or other software
– Adversaries: non-providers-affiliated malicious parties
– Victims: users running confidentiality-requiring services in the cloud
• Focus on new cloud-related capabilities of the attacker and implicitly
expanding the attack surface
6
Threat Model (contd…)
• Like any customer, the malicious party can run and control many
instances in the cloud
– A maximum of 20 instances can be run in parallel using an Amazon EC2 account
• Attacker’s instance might be placed on the same physical hardware as
potential victims
• Attack might manipulate shared physical resources to learn otherwise
confidential information
• Two kinds of attack may take place:
– Attack on some known hosted service
– Attacking a particular victim’s service
7
Addresses the Following…
• Q1: Can one determine where in the cloud infrastructure an instance is
located?
• Q2: Can one easily determine if two instances are co-resident on the same
physical machine?
• Q3: Can an adversary launch instances that will be co-resident with other
user’s instances?
• Q4: Can an adversary exploit cross-VM information leakage once co-
resident?
8
Amazon EC2 Service
• Scalable, pay-as-you-go compute capacity in the cloud
• Customers can run different operating systems within a virtual machine
• Three degrees of freedom: instance-type, region, availability zone
• Different computing options (instances) available
– m1.small, c1.medium: 32-bit architecture
– m1.large, m1.xlarge, c1.xlarge: 64-bit architecture
• Different regions available
– US, EU, Asia
• Regions split into availability zones
– In US: East (Virginia), West (Oregon), West (Northern California)
– Infrastructures with separate power and network connectivity
• Customers randomly assigned to physical machines based on their instance, region, and availability
zone choices
9
Amazon EC2 Service (contd…)
• Xen hypervisor
– Domain0 (Dom0): privileged virtual machine
• Manages guest images
• Provisions physical resources
• Access control rights
• Configured to route packets for its guest images and reports itself as a hop in traceroutes.
– When an instance is launched, it is assigned to a single physical machine for its lifetime
• Each instance is assigned internal and external IP addresses and domain names
– External IP: public IPv4 address [IP: 75.101.210.100/domain name: ec2-75-101-210-100.compute-1.amazonaws.com]
– Internal IP: RFC 1918 private address [IP: 10.252.146.52/domain name: domU-12-31-38-00-8D-C6.compute-1.internal]
• Within the cloud, both domain names resolve to the internal IP address
• Outside the cloud, external name is mapped to the external IP address
10
Q1: Cloud Cartography
• Instance placing is not disclosed by Amazon but is needed to launch co-
residency attack
• Map the EC2 service to understand where potential targets are located in the
cloud
• Determine instance creation parameters needed to attempt establishing co-
residence of an adversarial instance
• Hypothesis: different availability zones and instance types correspond to
different IP address ranges
11
Network Probing
• Identify public servers hosted in EC2 and verify co-residence
• Open-source tools have been used to probe ports (80 and 443)
– nmap – perform TCP connect probes (attempt to complete a 3-way hand-shake between a source and target)
– hping – perform TCP SYN traceroutes, which iteratively sends TCP SYN packets with increasing TTLs, until no ACK is
received
– wget – used to retrieve web pages
• External probe: probe originating from a system outside EC2 and has an EC2 instance as
destination
• Internal probe: originates from an EC2 instance, and has destination another EC2 instance
• Given an external IP address, DNS resolution queries are used to determine:
– External name
– Internal IP address
12
Survey Public Servers on EC2
• Goal: to enable identification of the instance type and availability zone of one or more potential targets
• WHOIS: used to identify distinct IP address prefixes associated with EC2
• EC2 public IPs: /17, /18, /19 prefixes
– 57344 IP addresses
• Use external probes to find responsive IPs:
– Performed TCP connect probe on port 80
• 11315 responsive IPs
– Followed up with wget on port 80
• 9558 responsive IPs
– Performed a TCP scan on port 443
• 8375 responsive IPs
• Used DNS lookup service
– Translate each public IP address that responded to either the port 80 or 443 scan into an internal EC2 address
– 14054 unique internal IPs obtained
13
Instance Placement Parameters
• EC2’s internal address space is cleanly partitioned between availability zones
– Three availability zones; five instance types per zone
– 20 instances launched for each of the 15 availability zone/instance type pairs from a particular account (say, Account A)
– Samples from each zone are assigned IP addresses from disjoint portions of the observed internal address space
– Assumption: internal IP addresses are statically assigned to physical machines (to ease IP routing)
– Availability zones use separate physical infrastructure
14
Instance Placement Parameters (contd…)
• 100 instances have been launched in Zone 3 using two different accounts: A & B (39 hours
after terminating the Account A instances)
15
Q2: Determining Co-residence
16
Verifying Co-residency Check
• If two (under self-control) instances can successfully transmit via the covert channel, then they
are co-resident, otherwise not
• Experiment: hard-disk-based covert channel
– To send a 1, sender reads from random locations on a shared volume, to send a 0 sender does nothing
– Receiver times reading from a fixed location on the disk: longer read times mean a 1 is set, shorter a 0
• 3 m1.small EC2 accounts: control, victim, probe
– 2 control instances in each of 3 availability zones, 20 victim and 20 probe instances in Zone 3
• Determine Dom0 address for each instance
• For each ordered pair (A, B) of 40 instances, perform co-residency checks
• After 3 independent trials, 31 (potentially) co-resident pairs have been identified - 62 ordered
pairs
• 5 bit message from A to B was successfully sent for 60 out of 62 ordered pairs
17
Effective Co-residency Check
• For checking co-residence with target instances:
– Compare internal IP addresses to see if they are close
– If yes, perform a TCP SYN traceroute to an open port on the target and see if
there is only a single hop (Dom0 IP)
• Check requires sending (at most) two TCP SYN packets
– No full TCP connection is established
• Very “quiet” check (little communication with the victim)
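A toy sketch of the first half of this check; the /24 "closeness" threshold is an assumption for illustration (not the paper's exact rule), and the TCP SYN traceroute/Dom0 step is omitted:

```python
import ipaddress

def possibly_coresident(ip_a, ip_b, prefix=24):
    """Step 1 of the check: are the two internal IPs numerically 'close'?"""
    net = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in net

print(possibly_coresident("10.252.146.52", "10.252.146.201"))   # True  -> run traceroute step
print(possibly_coresident("10.252.146.52", "10.251.9.7"))       # False -> skip
```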
18
Q3: Causing Co-residence
• Two strategies to achieve “good” coverage (co-residence with a
good fraction of target set)
– Brute-force placement:
• run numerous probe instances over a long period of time and see how many targets one can
achieve co-residence with.
• For co-residency check, the probe performed a wget on port 80 to ensure the target was still
serving web pages
• Of the 1686 target victims, the brute-force probes achieved co-residency with 141 victim
servers (8.4% coverage)
• Even a naïve strategy can successfully achieve co-residence against a not-so-small fraction of
targets
– Target recently launched instances:
• take advantage of the tendency of EC2 to assign fresh instances to small set of machines
19
Leveraging Placement Locality
• Placement locality
– Instances launched simultaneously from same account do not run on the same physical machine
– Sequential placement locality: exists when two instances run sequentially (the first terminated before
launching the second) are often assigned to the same machine
– Parallel placement locality: exists when two instances run (from distinct accounts) at roughly the same time
are often assigned to the same machine.
• Instance flooding: launch lots of instances in parallel in the appropriate availability zone
and of the appropriate type
20
Leveraging Placement Locality (contd…)
• Experiment
– Single victim instance is launched
– Attacker launches 20 instances within 5 minutes
– Perform co-residence check
– 40% of the time the attacker launching just 20 probes achieves co-residence against a specific target
instance
21
Q4: Exploiting Co-residence
• Cross-VM attacks can allow for information leakage
• How can we exploit the shared infrastructure?
– Gain information about the resource usage of other instances
– Create and use covert channels to intentionally leak information from one instance to
another
– Some applications of this covert channel are:
• Co-residence detection
• Surreptitious detection of the rate of web traffic a co-resident site receives
• Timing keystrokes by an honest user of a co-resident instance
22
Exploiting Co-residence (contd…)
• Measuring cache usage
– Time-shared cache allows an attacker to measure when other instances are
experiencing computational load
– Load measurement: allocate a contiguous buffer B of b bytes, s is cache line size (in
bytes)
• Prime: read B at s-byte offsets in order to ensure that it is cached.
• Trigger: busy-loop until CPU’s cycle counter jumps by a large value
• Probe: measure the time it takes to again read B at s-byte offset
– Cache-based covert channel:
• Sender idles to transmit a 0 and frantically accesses memory to transmit a 1
• Receiver accesses a memory block and observes the access latencies
• High latencies are indicative that “1” is transmitted
23
Exploiting Co-residence (contd…)
• Load-based co-residence check
– Co-residence check can be done without network-based techniques
– Adversary can actively cause load variation due to a publicly-accessible service running on the target
– Use a priori knowledge about load variation
– Induce computational load (lots of HTTP requests) and observe the differences in load samples
• Instances in Trial 1 and Trial 2 were co-resident on distinct physical machines; instances in Trial 3 were not co-
resident
24
Exploiting Co-residence (contd…)
• Estimating traffic rates
– Load measurement might provide a method for estimating the number of visitors to a co-resident web
server
– It might not be a public information and could be damaging
– Perform 1000 cache load measurements in which
• no HTTP requests are sent
• HTTP requests sent at a rate of (i) 50 per minute, (ii) 100 per minute, (iii) 200 per minute
25
Exploiting Co-residence (contd…)
• Keystroke timing attack
– The goal is to measure the time between keystrokes made by a victim typing a
password (or other sensitive information)
– Malicious VM can observe keystroke timing in real time via cache-based load
measurements
– Inter-keystroke times, if properly measured, can be used to perform recovery of the
password
– In an otherwise idle machine, a spike in load corresponds to a letter being typed
into the co-resident VM’s terminal
– Attacker does not directly learn exactly which keys are pressed, the attained
timing resolution suffices to conduct the password-recovery attacks on SSH
sessions
26
Preventive Measures
• Mapping
– Use a randomized scheme to allocate IP addresses
– Block some tools (nmap, traceroute)
• Co-residence checks
– Prevent identification of Dom0
• Co-location
– Not allow co-residence at all
• Beneficial for cloud user
• Not efficient for cloud provider
• Information leakage via side-channel
– No solution
27
Summary
• New risks from cloud computing
• Shared physical infrastructure may and most likely will cause
problems
– Exploiting software vulnerabilities not addressed here
28
Thank You!
29
29
CLOUD COMPUTING
CLOUD SECURITY IV
Security Issues in Collaborative SaaS Cloud
2
Security Responsibilities
3
SaaS Cloud-based Collaboration
• APIs for sharing resources/information
– Service consumer(customers): human users, applications, organizations/domains,
etc.
– Service provider: SaaS cloud vendor
• SaaS cloud-centric collaboration: valuable and essential
– Data sharing
– Problems handled: inter-disciplinary approach
• Common concerns:
– Integrity of data, shared across multiple users, may be compromised
– Choosing an “ideal” vendor
Nirnay Ghosh, Securing Loosely-coupled Collaborations in a SaaS
Cloud through Risk Estimation and Access Conflict Mediation, PhD
Thesis, IIT Kharagpur, 2016 4
SaaS Cloud-based Collaboration
• Types of collaboration in multi-domain/cloud systems:
– Tightly-coupled or federated
– Loosely-coupled
• Challenges: securing loosely-coupled collaborations in cloud
environment
– Security mechanisms: mainly proposed for tightly-coupled
systems
– Restrictions in the existing authentication/authorization
mechanisms in clouds
5
Motivations and Challenges
• SaaS cloud delivery model: maximum lack of control
• No active data streams/audit trails/outage report
– Security: Major concern in the usage of cloud services
• Broad scope: address security issues in SaaS clouds
• Cloud marketplace: rapid growth due to recent advancements
• Availability of multiple service providers
– Choosing SPs from SLA guarantees: not reliable
• Inconsistency in service level guarantees
• Non-standard clauses and technical specifications
• Focus: selecting an “ideal” SaaS cloud provider and address the security issues
6
Motivations and Challenges
• Online collaboration: popular
• Security issue: unauthorized disclosure of sensitive information
– Focus: selecting an ideal SaaS cloud provider and secure the
collaboration service offered by it
• Relevance in today’s context: loosely-coupled collaboration
– Dynamic data/information sharing
• Final goal (problem statement): selecting an ideal SaaS cloud
provider and securing the loosely-coupled collaboration in its
environment
7
Objective - I
A framework (SelCSP) for selecting a trustworthy and competent
collaboration service provider.
9/20/2017
Objective - II
Select requests (for accessing local resources) from anonymous users, such that
both access risk and security uncertainty due to information sharing are kept low.
9/20/2017
Objective - III
Formulate a heuristic for solving the IDRM problem, such that minimal
excess privilege is granted
9/20/2017
Objective - IV
A distributed secure collaboration framework,
which uses only local information to dynamically
detect and remove access conflicts.
9/20/2017
Selection of Trustworthy and Competent
SaaS Cloud Provider for Collaboration
9/20/2017
Trust Models in Cloud
• Challenges
– Most of the reported works have not presented mathematical
formulation or validation of their trust and risk models
– Web service selection [Liu’04][Garg’13] based on QoS and trust are
available
• Select resources (e.g. services, products, etc.) by modeling their
performance
• Objective: Model trust/reputation/competence of service provider
9/20/2017
Service Level Agreement (SLA) for Clouds
• Challenges:
– Majority of the cloud providers guarantee “availability” of services
– Consumers not only demand availability guarantee but also other
performance related assurances which are equally business critical
– Present day cloud SLAs contain non-standard clauses regarding assurances
and compensations following a violation[Habib’11]
• Objective: Establish a standard set of parameters for cloud SLAs, since it
reduces the perception of risk in outsourced services
9/20/2017
SelCSP Framework
9/20/2017
SelCSP Framework - Overview
Recommending Access Requests from
Anonymous Users for Authorization
Risk-based Access Control (RAC)
• RAC: Gives access to subjects even though they lack proper permissions
– Goal: balance between access risk and security uncertainty due to information
sharing
– Flexible compared to binary MLS
• Challenges
– Computing security uncertainty: not addressed
– Authorization in existing RAC system: based on risk threshold and operational
need.
• Operational need: not quantified.
• Discards many requests which potentially maximizes information sharing
9/20/2017
Distributed RAC using Fuzzy Inference System
9/20/2017
Mapping of Authorized Permissions into
Local Roles
9/20/2017
Inter-Domain Role Mapping (IDRM)
• Finds a minimal set of roles which encompasses the requested permission set.
– No polynomial time solution
– Greedy search-based heuristics: suboptimal solutions
• Challenges:
– There may exist multiple minimal role sets
– There may not exist any role set which exactly maps all permissions
• Two variants of IDRM proposed: IDRM-safety, IDRM-availability
• Objective: formulate a novel heuristic to generate better solution for the IDRM-
availability problem.
• Minimize the number of additional permissions
9/20/2017
Distributed Role Mapping Framework
9/20/2017
Distributed Role Mapping Framework
Dynamic Detection and Removal of
Access Policy Conflicts
Access Conflicts
9/20/2017
Objective
• Dynamic detection of conflicts to address security issue
• Removal of conflicts to address availability issue
• Proposed: distributed secure collaboration framework
Role Sequence Generation
[Figure: Distributed Secure Collaboration Framework]
• Interoperation request: pair of entry (from requesting domain) and exit (from providing domain) roles
• Role sequence: ordered succession of entry and exit roles
• Role cycle: may be a safe role cycle or an unsafe role cycle
9/20/2017
Conflict Detection
• Detection of inheritance conflict
– Necessary condition: at least one exit role
– Sufficient condition: current entry role is senior to at least one exit role
• Detection of SoD constraint violation
– Necessary condition: at least one exit role
– Sufficient condition: current entry role and at least one exit role forms
conflicting pair
9/20/2017
Conflict Removal
Cyclic Inheritance: Inheritance Conflict Removal Rule for Exactly Matched Role
9/20/2017
Conflict Removal
Cyclic Inheritance: Inheritance Conflict Removal Rule for No-Exactly Matched Role
9/20/2017
Conflict Removal
SoD Constraint Violation
• Two cases: similar to removal of inheritance conflict
– Additional constraint: identifying conflicting permission between
collaborating role and entry role in current domain
– Conflicting permission
• Objects are similar
• Hierarchical relation exists between access modes
9/20/2017
Conflict Removal
SoD Constraint Violation: SoD Conflict Removal Rule for Exactly Matched Role
9/20/2017
Conflict Removal
SoD Constraint Violation: SoD Conflict Removal Rule for No-Exactly Matched Role
9/20/2017
Summary
Secure Collaboration SaaS Clouds: A Typical Approach
• Selection of Trustworthy and Competent SaaS Cloud Provider
for Collaboration
• Recommending Access Requests from Anonymous Users for
Authorization
• Mapping of Authorized Permissions into Local Roles
• Dynamic Detection and Removal of Access Policy Conflicts
33
Thank You!
34
34
Cloud Computing :
Broker for Cloud Marketplace
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
9/20/2017 1
INTRODUCTION
• Rapid growth of available cloud services
• Huge number of providers with varying QoS
• Different types of customer use cases – each
with different requirements
9/20/2017 2
MOTIVATION
• Trustworthiness of provider
• Monitoring of services
9/20/2017 4
OBJECTIVES
• Selection of the most suitable provider satisfying customer's QoS
requirements
9/20/2017 5
Different Approaches
• CloudCmp: a tool that compares cloud providers in order to
measure the QoS they offer and helps users to select a cloud.
• Fuzzy provider selection mechanism.
• Framework with a measure of satisfaction with a provider, keeping in mind the
fuzzy nature of the user requirements.
• Provider selection framework which takes into account the
trustworthiness and competence of a provider.
9/20/2017 6
CUSTOMER QoS PARAMETERS
Infrastructure-as-a-Service
Software-as-a-Service
9/20/2017 7
PROVIDER
• Promised QoS values :
• Trust values :
Note: They have been kept independent as they pertain to different parameters
9/20/2017 8
Typical MARKETPLACE Architecture
9/20/2017 9
PROVIDER SELECTION
• Selection of provider is done using a fuzzy inference engine
• Input : QoS offered by a provider and its trustworthiness
• Output : Suitability of the provider for the customer
• Customer request is dispatched to provider with maximum suitability
• Membership functions are built using the user requirements
9/20/2017 10
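A hedged sketch of this kind of selection logic: membership functions are derived from the customer's requirement, each provider gets a suitability degree, and the request goes to the provider with maximum suitability. The breakpoints, the min-style AND rule, and the sample providers are illustrative assumptions, not the actual inference engine.

    def mu_meets_requirement(offered, required):
        """Degree to which an offered QoS value satisfies the requirement:
        1.0 at/above the requirement, ramping down to 0 at 80% of it."""
        if offered >= required:
            return 1.0
        lo = 0.8 * required
        return 0.0 if offered <= lo else (offered - lo) / (required - lo)

    def mu_trustworthy(trust):
        """Degree of trustworthiness for a trust value in [0, 1] (ramp between 0.4 and 0.8)."""
        return min(1.0, max(0.0, (trust - 0.4) / 0.4))

    def select_provider(providers, required_qos):
        """providers: dict name -> (offered_qos, trust). Returns the most suitable provider."""
        def suitability(name):
            offered, trust = providers[name]
            # Mamdani-style AND (min) of the two input memberships
            return min(mu_meets_requirement(offered, required_qos), mu_trustworthy(trust))
        return max(providers, key=suitability)

    # Example: availability requirement of 99.5%, three candidate providers
    providers = {"A": (99.9, 0.9), "B": (99.6, 0.5), "C": (98.0, 0.95)}
    print(select_provider(providers, 99.5))   # -> "A" under these assumptions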
PROVIDER SELECTION
9/20/2017 11
PROVIDER SELECTION – INPUT MEMBERSHIP FUNCTION
9/20/2017 12
PROVIDER SELECTION – INPUT MEMBERSHIP FUNCTION
9/20/2017 13
PROVIDER SELECTION – OUTPUT MEMBERSHIP FUNCTION
9/20/2017 14
MONITORING MODULE
(Figure: performance of each service instance SIi in the current monitoring period is obtained
from the provider/3rd-party monitoring tool and from the repository)
9/20/2017 15
MIGRATION DECIDER
• Makes use of a fuzzy inference engine
• Input :
9/20/2017 16
MIGRATION DECIDER – OUTPUT MEMBERSHIP FUNCTION
9/20/2017 17
MIGRATION MODULE - SELECTION OF TARGET PROVIDER
9/20/2017 18
Case study on IaaS Marketplace
9/20/2017 19
EXPERIMENTS AND RESULTS
9/20/2017 20
EXPERIMENTS AND RESULTS
9/20/2017 21
EXPERIMENTS AND RESULTS
9/20/2017 22
Case study on SaaS Marketplace
9/20/2017 23
EXPERIMENTS AND RESULTS
9/20/2017 24
Experiments and Results
9/20/2017 25
EXPERIMENTS AND RESULTS
9/20/2017 26
Future Scope
• Specification of flexibility in QoS requirements
9/20/2017 27
9/20/2017 28
CLOUD COMPUTING
Mobile Cloud Computing - I
1
Motivation
Growth in the use of Smart phones, apps
Increased capabilities of mobile devices
More Internet access from mobile devices than from PCs!
http://www.rapidvaluesolutions.com/whitepapers/How-MBaaS-is-Shaping-up-Enterprise-Mobility-Space.html
3
Augmenting Mobiles with Cloud Computing
Amazon Silk browser: split browser
Apple Siri: speech recognition in the cloud
Apple iCloud: unlimited storage and sync capabilities
Image recognition apps on smart-phones useful in developing augmented
reality apps on mobile devices
Augmented reality app using Google Glass
4
What is Mobile Cloud Computing?
Mobile cloud computing (MCC) is the combination of cloud computing, mobile computing
and wireless networks to bring rich computational resources to mobile users.
MCC provides mobile users with data storage and processing services in clouds
Obviating the need to have a powerful device configuration (e.g. CPU speed, memory
capacity etc.)
Mobile cloud computing is the combination of cloud computing and mobile networks to bring
benefits for mobile users, network operators, as well as cloud providers
All resource-intensive computing can be performed in the cloud
Moving computing power and data storage away from the mobile devices
Powerful and centralized computing platforms located in clouds
Accessed over the wireless connection based on a thin native client
https://www.ibm.com/cloud-computing/learn-more/what-is-mobile-
cloud-computing/
5
Why Mobile Cloud Computing?
Speed and flexibility
Mobile cloud applications can be built or revised quickly using cloud services. They can be delivered
to many different devices with different operating systems
Shared resources
Mobile apps that run on the cloud are not constrained by a device's storage and processing
resources. Data-intensive processes can run in the cloud. User engagement can continue seamlessly
from one device to another.
Integrated data
Mobile cloud computing enables users to quickly and securely collect and integrate data from
various sources, regardless of where it resides.
6
Key-features of Mobile Cloud Computing
Mobile cloud computing delivers applications to mobile devices quickly and securely, with capabilities
beyond those of local resources
7
Mobile Cloud Computing
Pros:
  Saves battery power
  Makes execution faster
Cons:
  Must send the program states (data) to the cloud server, hence consumes battery
  Network latency can lead to execution delay
10
MCC key components
Profiler
Profiler monitors application execution to collect data about the time to
execute, power consumption, network traffic
Solver
Solver has the task of selecting which parts of an app run on the mobile device and which
on the cloud
Synchronizer
The synchronizer collects the results of the split execution, combines them, and makes the
execution details transparent to the user
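A toy sketch of how these three components could fit together; the data shapes and the simplified energy rule inside the solver are assumptions for exposition only.

    # Profiler output (illustrative): per-method estimates of local execution energy
    # and of the energy needed to transfer its program state to the cloud.
    profile = {
        "decode": {"local_energy": 5.0, "transfer_energy": 1.0},
        "render": {"local_energy": 0.5, "transfer_energy": 2.0},
    }

    def solve(profile):
        """Solver: offload a method when shipping its state costs less energy than
        running it locally (a deliberately simplified placement rule)."""
        return {m: ("cloud" if p["transfer_energy"] < p["local_energy"] else "mobile")
                for m, p in profile.items()}

    def synchronize(results_mobile, results_cloud):
        """Synchronizer: merge the results of the split execution so the app sees one outcome."""
        merged = dict(results_mobile)
        merged.update(results_cloud)
        return merged

    placement = solve(profile)            # {'decode': 'cloud', 'render': 'mobile'}
    print(placement)
    print(synchronize({"render": "frame_001"}, {"decode": "stream_segment"}))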
11
Key Requirements for MCC
12
Mobile Cloud Computing – Typical Architecture
Dinh, Hoang T., et al. "A survey of mobile cloud computing: architecture,
applications, and approaches." Wireless communications and mobile
13
computing 13.18 (2013): 1587-1611.
Mobile Cloud Computing - Architecture
14
Mobile Cloud Computing - Architecture
15
Mobile Cloud Computing - Architecture
16
Advantages of MCC
Extending battery lifetime
– Computation offloading migrates large computations and complex processing from resource-
limited devices (i.e., mobile devices) to resourceful machines (i.e., servers in clouds).
– Remote application execution can save energy significantly.
– Many mobile applications take advantage of task migration and remote processing
Improving data storage capacity and processing power
– MCC enables mobile users to store/access large data on the cloud.
– MCC helps reduce the running cost for computation intensive applications.
– Mobile applications are not constrained by storage capacity on the devices because their data is now
stored on the cloud
17
Advantages of MCC (contd…)
Improving Reliability and Availability
– Keeping data and applications in the clouds reduces the chance of data loss on the mobile
devices.
– MCC can be designed as a comprehensive data security model for both service providers
and users:
• Protect copyrighted digital contents in clouds.
• Provide security services such as virus scanning, malicious code detection, authentication for
mobile users.
– With data and services in the clouds, they are (almost) always available even when the users
are moving.
18
Advantages of MCC
• Dynamic provisioning
• Scalability
• Multi-tenancy
– Service providers can share the resources and costs to support a variety of
applications and large no. of users.
• Ease of Integration
– Multiple services from different providers can be integrated easily through the cloud and
the Internet to meet the users’ demands.
19
Mobile Cloud Computing – Challenges
20
Mobile Cloud Computing – Challenges
21
Mobile Cloud Computing – Challenges
Security for Mobile Users
Approaches to move the threat detection capabilities to clouds.
Host agent runs on mobile devices to inspect the file activity on a system. If an identified file is
not available in a cache of previously analyzed files, this file will be sent to the in-cloud network
service for verification.
The smartphone records only a minimal execution trace, and transmits it to the security
server in the cloud.
22
Mobile Cloud Computing – Challenges
• Context-aware mobile cloud services try to utilize the local contexts (e.g., data types, network
status, device environments, and user preferences) to improve the quality of service (QoS).
H. H. La and S. D. Kim, “A Conceptual Framework for Provisioning Context-aware Mobile Cloud Services”,
in Proceedings of IEEE International Conference on Cloud Computing (CLOUD), pp. 466, August 2010.
23
Mobile Cloud Computing – Challenges
Network Access Management:
– An efficient network access management not only improves link performance but also optimizes
bandwidth usage
Quality of Service:
– How to ensure QoS is still a big issue, especially on network delay.
– CloneCloud and Cloudlets are expected to reduce the network delay.
– The idea is to clone the entire set of data and applications from the smartphone onto the cloud and to
selectively execute some operations on the clones, reintegrating the results back into the smartphone
Pricing:
– MCC involves both mobile service provider (MSP) and cloud service provider (CSP) with different
services management, customers management, methods of payment and prices.
– Business model including pricing and revenue sharing has to be carefully developed for MCC.
24
Mobile Cloud Computing – Challenges
Standard Interface:
– Interoperability becomes an important issue when mobile users need to interact with the cloud.
– Compatibility among devices for web interface could be an issue.
– Standard protocol, signaling, and interface between mobile users and cloud would be required.
Service Convergence:
– Services will be differentiated according to the types, cost, availability and quality.
– A new scheme is needed in which the mobile users can utilize multiple clouds in a unified fashion.
– Automatic discovery and composition of services for users.
– Sky computing is a model where resources from multiple clouds providers are leveraged to create
a large scale distributed infrastructure.
– Service integration (i.e., convergence) would need to be explored.
25
Key challenges
MCC requires dynamic partitioning of an application to optimize
Energy saving
Execution time
Requires a software (middleware) that decides at app launch which parts
of the application must execute on the mobile device, and which parts
must execute on cloud
A classic optimization problem
26
MCC Systems: MAUI (Mobile Assistance Using Infrastructure)
• MAUI enables the programmer to produce an initial partition of the program
– Programmer marks each method as “remoteable” or not
– Native methods cannot be remoteable
• MAUI framework uses the annotation to decide whether a method should be executed on the
cloud server to save energy and execution time
• MAUI server is the cloud component. The framework has the necessary software modules
required in the workflow.
29
Task Partitioning Problem in MCC
Input:
• A call graph representing an application’s method call sequence
• Attributes for each node in the graph denotes
(a) energy consumed to execute the method on the mobile device,
(b) energy consumed to transfer the program states to a remote server
Output:
• Partition the methods into two sets – one set marks the methods to execute on the mobile
device, and the second set marks the methods to execute on cloud
Goals and Constraints:
1. Energy consumed must be minimized
2. There is a limit on the execution time of the application
3. Other constraints could be – some methods must be executed on mobile device, total monetary
cost, etc.
30
Mathematical Formulation
A Directed Acyclic Graph (call graph) represents the application
Highlighted nodes must be executed on the mobile device -> called native tasks (v1, v4, v9)
Edges represent the sequence of execution; any non-highlighted node can be executed either
locally on the mobile device or on cloud
• 0-1 integer linear program, where Iv = 0 if method v is executed locally, and Iv = 1 if executed remotely
• Ev : Energy cost to execute method v locally
• Cu,v : Cost of data transfer between methods u and v
• L : Total execution latency (limit)
• Tv : Time to execute method v
• B : Time to transfer program state
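For concreteness, the optimization can be written out as a 0-1 program in the spirit of the MAUI formulation, using the symbols above. This is a hedged sketch rather than the slide's exact model: it assumes a remote execution time T_v^r (not listed above) and per-edge transfer terms C_{u,v}, B_{u,v}; the absolute-value terms can be linearized in the usual way.

    \begin{align*}
    \text{maximize}\quad & \sum_{v} I_v\,E_v \;-\; \sum_{(u,v)} \lvert I_u - I_v\rvert\, C_{u,v}\\
    \text{subject to}\quad & \sum_{v} \bigl((1 - I_v)\,T_v + I_v\,T^{r}_{v}\bigr)
       \;+\; \sum_{(u,v)} \lvert I_u - I_v\rvert\, B_{u,v} \;\le\; L\\
    & I_{v} = 0 \qquad \text{for every native task } v \in \{v_1, v_4, v_9\}\\
    & I_{v} \in \{0, 1\} \quad \text{for all methods } v
    \end{align*}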
31
Mathematical Formulation (Contd..)
• Static Partitioning
– When an application is launched, invoke an ILP solver which will tell where each
method should be executed
– There are also heuristics to find solutions faster
• Dynamic or Adaptive Partitioning
– For a long running program, the environmental conditions can vary
– Depending on the input, the energy consumption of a method can vary
32
Mobile Cloud Computing – Challenges/ Issues
Mobile communication issues
Low bandwidth: One of the biggest issues, because the radio resource for wireless networks
is much scarcer than for wired networks
Service availability: Mobile users may not be able to connect to the cloud to obtain a service
due to traffic congestion, network failures, mobile signal strength problems
Heterogeneity: Handling wireless connectivity with highly heterogeneous networks to satisfy
MCC requirements (always-on connectivity, on-demand scalability, energy efficiency) is a
difficult problem
Computing issues (Computation offloading)
One of the main features of MCC
Offloading is not always effective in saving energy
It is critical to determine whether to offload and which portions of the service codes to
offload
33
CODE OFFLOADING USING CLOUDLET
CLOUDLET:
“a trusted, resource-rich computer or cluster of computers that is well-connected
to the Internet and is available for use by nearby mobile devices.”
Code Offloading :
Offloading the code to the remote server and executing it.
This architecture decreases latency by using a single-hop network and potentially
lowers battery consumption by using Wi-Fi or short-range radio instead of
broadband wireless which typically consumes more energy.
34
CODE OFFLOADING USING CLOUDLET
Cloudlet Application Overlay Creation Process
35
CODE OFFLOADING USING CLOUDLET
Cloudlet Application Overlay Creation Process
36
When to Offload??
The amount of energy saved by offloading is:
    $P_c \times \frac{C}{M} \;-\; P_i \times \frac{C}{S} \;-\; P_{tr} \times \frac{D}{B}$
M: the speed of the mobile device to compute C instructions
S: the speed of the cloud to compute C instructions
D: the data that need to be transmitted
B: the bandwidth of the wireless Internet
Pc: the energy cost per second when the mobile phone is doing computing
Pi: the energy cost per second when the mobile phone is idle
Ptr: the energy cost per second when the mobile phone is transmitting data
Suppose the server is F times faster, that is, S = F × M. We can rewrite the formula as
    $\frac{C}{M} \times \left(P_c - \frac{P_i}{F}\right) \;-\; P_{tr} \times \frac{D}{B}$
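As a quick numeric illustration of this break-even rule, the sketch below evaluates the rewritten formula for two hypothetical workloads; every parameter value is made up for exposition.

    def energy_saved(C, M, F, D, B, Pc, Pi, Ptr):
        """Energy saved by offloading: (C/M)*(Pc - Pi/F) - Ptr*(D/B).
        Positive -> offloading saves energy; negative -> run locally."""
        return (C / M) * (Pc - Pi / F) - Ptr * (D / B)

    # Compute-heavy task with little state to ship: offloading pays off (positive)
    print(energy_saved(C=5e9, M=1e9, F=10, D=1e6, B=1e6, Pc=0.9, Pi=0.3, Ptr=1.3))
    # Light computation but a lot of data to transfer: better to run locally (negative)
    print(energy_saved(C=1e8, M=1e9, F=10, D=5e7, B=1e6, Pc=0.9, Pi=0.3, Ptr=1.3))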
37
When to Offload? (contd..)
Energy is saved when the formula produces a positive number. The formula is positive if D/B is
sufficiently small compared with C/M and F is sufficiently large.
Cloud computing can potentially save energy for mobile users.
Not all applications are energy efficient when migrated to the cloud.
Cloud computing services would be significantly different from cloud services for desktops
because they must offer energy savings.
The services should consider the energy overhead for privacy, security, reliability, and data
communication before offloading.
38
When to Offload?? (contd..)
39
Computation Offloading Approaches
Partition a program based on estimation of energy consumption before execution
Z. Li, C. Wang, and R. Xu, “Computation offloading to save energy on handheld devices: a partition scheme,” in Proc 2001 Intl Conf on
Compilers, architecture, and synthesis for embedded systems (CASES), pp. 238-246, Nov 2001.
K. Kumar and Y. Lu, “Cloud Computing for Mobile Users: Can Offloading Computation Save Energy,” IEEE Computer, vol. 43, no. 4, April 2010
40
How to evaluate MCC performance
• Energy Consumption
– Must reduce energy usage and extend battery life
• Time to Completion
– Should not take longer to finish the application compared to local execution
• Monetary Cost
– Cost of network usage and server usage must be optimized
• Security
– As offloading transfers data to the servers, confidentiality and privacy of the data must be
ensured, and methods that process confidential data must be identified
41
Open Questions?
• How can one design a practical and usable MCC framework
– System as well as partitioning algorithm
• Is there a scalable algorithm for partitioning
– Optimization formulations are NP-hard
– Heuristics fail to give any performance guarantee
• Which are the most relevant parameters to consider in the
design of MCC systems?
42
Mobile Cloud Computing – Applications?
Mobile Health-care
Health-Monitoring services, Intelligent emergency
management system, Health-aware mobile devices (detect
pulse rate, blood pressure, alcohol-level etc.)
Mobile Gaming
It can completely offload the game engine, which requires large
computing resources (e.g., graphics rendering), to the server in
the cloud
Mobile Commerce
M-commerce allows business models for commerce using
mobile (Mobile financial, mobile advertising, mobile shopping)
43
Mobile Cloud Computing – Applications?
Mobile Learning
M-learning combines e-learning and mobility
Traditional m-learning has limitations: high cost of devices/network, low transmission rate,
limited educational resources
Cloud-based m-learning can solve these limitations
Enhanced communication quality between students and teachers
Help learners access remote learning resources
A natural environment for collaborative learning
Assistive Technologies
Pedestrian crossing guide for the blind and visually impaired
Mobile currency reader for the blind and visually impaired
Lecture transcription for hearing-impaired students
44
MuSIC: Mobility-Aware Optimal Service Allocation in Mobile Cloud Computing
User mobility introduces new complexities in enabling an optimal decomposition of tasks that can
execute cooperatively on mobile clients and the tiered cloud architecture while considering multiple
QoS goals such as application delay, device power consumption and user cost/price.
Apart from scalability and access issues with the increased number of users, mobile applications are
faced with increased latencies and reduced reliability
As a user moves, the physical distance between the user and the cloud resources originally
provisioned changes causing additional delays
Further, the lack of effective handoff mechanisms in WiFi networks as users move rapidly causes an
increase in the number of packet losses
In other words, user mobility, if not addressed properly, can result in suboptimal resource mapping
choices and ultimately in diminished application QoS
Efficient techniques for dynamic mapping of resources in the presence of mobility; using
a tiered cloud architecture, to meet the multidimensional QoS needs of mobile users
Location-time workflow (LTW) as the modeling framework to model mobile applications and
capture user mobility. Within this framework, mobile service usage patterns as a function of
location and time have been formally modelled
Given a mobile application execution expressed as a LTW, the framework optimally partitions
the execution of the location-time workflow in the 2-tier architecture based on a utility metric
that combines service price, power consumption and delay of the mobile applications
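As a toy illustration of such a utility metric, the sketch below scores candidate allocations by a weighted combination of (normalized) price, power consumption, and delay; the weights and the normalization are assumptions, not MuSIC's actual metric.

    def utility(price, power, delay, w_price=0.3, w_power=0.3, w_delay=0.4):
        """Lower price, power and delay give higher utility; inputs are pre-normalized to [0, 1]."""
        return 1.0 - (w_price * price + w_power * power + w_delay * delay)

    # Two hypothetical allocations of a location-time workflow step
    candidates = {"run_on_device": (0.0, 0.9, 0.2), "offload_to_cloud": (0.4, 0.2, 0.5)}
    best = max(candidates, key=lambda name: utility(*candidates[name]))
    print(best)   # -> "run_on_device" under these made-up numbers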
Location Map: a partition of the 2-D space/region in which mobile hosts and cloud resources are located
User Service Set: the set of all services that a user has on his own device (e.g. decoders, image editors etc.)
Mobile User Trajectory: the trajectory of a mobile user, uk, is represented as a list of tuples of the form
{(1; lk); …; (n; lm)} where (i; lj) implies that the mobile user is in location lj for time duration i
Center of Mobility: the location where (or near where) a mobile user uk spends most of its time
Combination of the mobile application workflow concept with a user trajectory to model the mobile users
and the requested services in their trajectory.
Location-Time Workflow
As the number of vehicles increases, parking spaces are becoming increasingly insufficient in many large
cities, and this problem is gradually getting worse
With the proliferation of wireless sensor networks (WSNs) and cloud computing, there exists strong potential
to alleviate this problem using context information (e.g., road conditions and status of parking garages) to
provide context-aware dynamic parking services
Cloud Assisted parking services (traditional parking garages and dynamic parking services along the road) and
parking reservation service using smart terminals such as smartphones.
53
Traditional parking garages:
The context information of each
parking space detected by a WSN
is forwarded to the traffic cloud
by WSNs, third-generation (3G)
communications, and the
Internet.
The collected data are processed
in the cloud and then selectively
transmitted to the users.
This is helpful for providing more
convenient services and
evaluating the utilization levels of
the parking garage.
Also, the status of the parking
garage may be dynamically
published on a nearby billboard
to users who have no ability to
get the status by smart terminals.
55
A Case Study: Context Aware Dynamic Parking Service
Three aspects, including service planning of traffic authorities, reservation service process, and context-aware
optimization have been studied.
In order to make an effective prediction, researchers on vehicular social networks carry out traffic data
mining to discover useful information and knowledge from collected big data. The prediction process
depends on classifying the influence factors and designing a decision tree
By the method of probability analysis, the traffic authorities dynamically decide whether the road
can be authorized to provide context-aware parking services. In some particular cases, a fatal factor
may directly affect the decision making. For example, when a typhoon is approaching, traffic
authorities may immediately terminate services
56
A Case Study: Context Aware Dynamic Parking Service
Parking reservation services:
The status of a parking space can be monitored as determined by the corresponding system, and
subsequently updated in the traffic cloud.
The drivers or passengers can quickly obtain the parking space’s information by various smart
terminals such as smartphones. If a proper parking space cannot be found, the search scope is
extended further.
Within a given time, we may log into the traffic cloud and subscribe to a parking space.
57
A Case Study: Context Aware Dynamic Parking Service
58
A Case Study: Context Aware Dynamic Parking Service
Context-aware optimization:
The context information includes not only road conditions and the status of the parking garage, but
also the expected duration of parking.
Since the purpose of a visit to the place in question can determine the expected duration of parking,
this context information can be used to optimize the best parking locations for drivers.
For the parked vehicles, the expected duration of parking can be uploaded to the traffic cloud and
shared with potential drivers after analysis.
In this way, even when the parking garage has no empty parking spaces available, drivers still can
inquire as to the status of the parking garage and get the desired service by context-aware
optimization.
The proposed context-aware dynamic parking service is a promising solution for alleviating parking
difficulties and improving the QoS of CVC. Many technologies such as WSNs, traffic clouds, and
traffic data mining are enabling this application scenario to become a reality
59
Summary
Mobile cloud computing is one of the mobile technology trends in the future
because it combines the advantages of both MC and CC, thereby providing optimal
services for mobile users
MCC focuses more on user experience : Lower battery consumption , Faster
application execution
MCC architectures design the middleware to partition an application execution
transparently between mobile device and cloud servers
The applications supported by MCC, including m-commerce, m-learning, and mobile
healthcare, show the applicability of MCC to a wide range of domains.
The issues and challenges for MCC (i.e., from communication and computing sides)
demonstrate future research avenues and directions.
60
References
Dinh, Hoang T., et al. “A survey of mobile cloud computing: architecture, applications, and approaches.” Wireless communications and mobile
computing 13.18 (2013): 1587-1611
Z. Li, C. Wang, and R. Xu, “Computation offloading to save energy on handheld devices: a partition scheme,” in Proc 2001 Intl Conf on
Compilers, architecture, and synthesis for embedded systems (CASES), pp. 238-246, Nov 2001.
K. Kumar and Y. Lu, “Cloud Computing for Mobile Users: Can Offloading Computation Save Energy,” IEEE Computer, vol. 43, no. 4, April 2010
H. H. La and S. D. Kim, “A Conceptual Framework for Provisioning Context-aware Mobile Cloud Services,” in Proceedings of IEEE International
Conference on Cloud Computing (CLOUD), pp. 466, August 2010
Gordon, Mark S., et al. "COMET: Code Offload by Migrating Execution Transparently." OSDI. 2012.
Yang, Seungjun, et al. "Fast dynamic execution offloading for efficient mobile cloud computing." Pervasive Computing and Communications
(PerCom), 2013 IEEE International Conference on. IEEE, 2013
Shiraz, Muhammad, et al. "A review on distributed application processing frameworks in smart mobile devices for mobile cloud
computing."Communications Surveys & Tutorials, IEEE 15.3 (2013): 1294-1313
https://www.ibm.com/cloud-computing/learn-more/what-is-mobile-cloud-computing/
61
62
CLOUD COMPUTING
Mobile Cloud Computing - II
1
Mobile Cloud Computing (MCC) - Key challenges
MCC requires dynamic partitioning of an application to optimize
Energy saving
Execution time
Requires a software (middleware) that decides at app launch which parts
of the application must execute on the mobile device, and which parts
must execute on cloud
A classic optimization problem
2
MCC Systems: MAUI (Mobile Assistance Using Infrastructure)
• MAUI enables the programmer to produce an initial partition of the program
– Programmer marks each method as “remoteable” or not
– Native methods cannot be remoteable
• MAUI framework uses the annotation to decide whether a method should be executed on the
cloud server to save energy and execution time
• MAUI server is the cloud component. The framework has the necessary software modules
required in the workflow.
5
Task Partitioning Problem in MCC
Input:
• A call graph representing an application’s method call sequence
• Attributes for each node in the graph denotes
(a) energy consumed to execute the method on the mobile device,
(b) energy consumed to transfer the program states to a remote server
Output:
• Partition the methods into two sets – one set marks the methods to execute on the mobile
device, and the second set marks the methods to execute on cloud
Goals and Constraints:
1. Energy consumed must be minimized
2. There is a limit on the execution time of the application
3. Other constraints could be – some methods must be executed on mobile device, total monetary
cost, etc.
6
Mathematical Formulation
A Directed Acyclic Graph (call graph) represents the application
Highlighted nodes must be executed on the mobile device -> called native tasks (v1, v4, v9)
Edges represent the sequence of execution; any non-highlighted node can be executed either
locally on the mobile device or on cloud
• 0-1 integer linear program, where Iv = 0 if method v is executed locally, and Iv = 1 if executed remotely
• Ev : Energy cost to execute method v locally
• Cu,v : Cost of data transfer between methods u and v
• L : Total execution latency (limit)
• Tv : Time to execute method v
• B : Time to transfer program state
7
Static and Dynamic Partitioning
• Static Partitioning
– When an application is launched, invoke an ILP solver which will tell where each
method should be executed
– There are also heuristics to find solutions faster
• Dynamic or Adaptive Partitioning
– For a long running program, the environmental conditions can vary
– Depending on the input, the energy consumption of a method can vary
8
Mobile Cloud Computing – Challenges/ Issues
Mobile communication issues
Low bandwidth: One of the biggest issues, because the radio resource for wireless networks
is much scarcer than for wired networks
Service availability: Mobile users may not be able to connect to the cloud to obtain a service
due to traffic congestion, network failures, mobile signal strength problems
Heterogeneity: Handling wireless connectivity with highly heterogeneous networks to satisfy
MCC requirements (always-on connectivity, on-demand scalability, energy efficiency) is a
difficult problem
Computing issues (Computation offloading)
One of the main features of MCC
Offloading is not always effective in saving energy
It is critical to determine whether to offload and which portions of the service codes to
offload
9
CODE OFFLOADING USING CLOUDLET
CLOUDLET:
“a trusted, resource-rich computer or cluster of computers that is well-connected
to the Internet and is available for use by nearby mobile devices.”
Code Offloading :
Offloading the code to the remote server and executing it.
This architecture decreases latency by using a single-hop network and potentially
lowers battery consumption by using Wi-Fi or short-range radio instead of
broadband wireless which typically consumes more energy.
10
CODE OFFLOADING USING CLOUDLET
Cloudlet
11
When to Offload ?
The amount of energy saved by offloading is:
    $P_c \times \frac{C}{M} \;-\; P_i \times \frac{C}{S} \;-\; P_{tr} \times \frac{D}{B}$
(symbols as defined in Mobile Cloud Computing - I)
12
When to Offload? (contd..)
Energy is saved when the formula produces a positive number. The formula is positive if D/B is
sufficiently small compared with C/M and F is sufficiently large.
The amount of energy saved is
    $P_c \times \frac{C}{M} \;-\; P_i \times \frac{C}{S} \;-\; P_{tr} \times \frac{D}{B}$,
which, with S = F × M, can be rewritten as
    $\frac{C}{M} \times \left(P_c - \frac{P_i}{F}\right) \;-\; P_{tr} \times \frac{D}{B}$
Cloud computing can potentially save energy for mobile users.
Not all applications are energy efficient when migrated to the cloud.
Cloud computing services would be significantly different from cloud services for desktops
because they must offer energy savings.
The services should consider the energy overhead for privacy, security, reliability, and data
communication before offloading.
13
When to Offload?? (contd..)
14
Computation Offloading Approaches
Partition a program based on estimation of energy consumption before execution
Z. Li, C. Wang, and R. Xu, “Computation offloading to save energy on handheld devices: a partition scheme,” in Proc 2001 Intl Conf on
Compilers, architecture, and synthesis for embedded systems (CASES), pp. 238-246, Nov 2001.
K. Kumar and Y. Lu, “Cloud Computing for Mobile Users: Can Offloading Computation Save Energy,” IEEE Computer, vol. 43, no. 4, April 2010
15
How to evaluate MCC performance
• Energy Consumption
– Must reduce energy usage and extend battery life
• Time to Completion
– Should not take longer to finish the application compared to local execution
• Monetary Cost
– Cost of network usage and server usage must be optimized
• Security
– As offloading transfers data to the servers, confidentiality and privacy of the data must be
ensured, and methods that process confidential data must be identified
16
Challenges
• How can one design a practical and usable MCC framework
– System as well as partitioning algorithm
• Is there a scalable algorithm for partitioning
– Optimization formulations are NP-hard
– Heuristics fail to give any performance guarantee
• Which are the most relevant parameters to consider in the
design of MCC systems?
17
Mobile Cloud Computing – Applications?
Mobile Health-care
Health-Monitoring services, Intelligent emergency
management system, Health-aware mobile devices (detect
pulse rate, blood pressure, alcohol-level etc.)
Mobile Gaming
It can completely offload the game engine, which requires large
computing resources (e.g., graphics rendering), to the server in
the cloud
Mobile Commerce
M-commerce allows business models for commerce using
mobile (Mobile financial, mobile advertising, mobile shopping)
18
Mobile Cloud Computing – Applications?
Mobile Learning
M-learning combines e-learning and mobility
Traditional m-learning has limitations: high cost of devices/network, low transmission rate,
limited educational resources
Cloud-based m-learning can solve these limitations
Enhanced communication quality between students and teachers
Help learners access remote learning resources
A natural environment for collaborative learning
Assistive Technologies
Pedestrian crossing guide for the blind and visually impaired
Mobile currency reader for the blind and visually impaired
Lecture transcription for hearing-impaired students
19
MuSIC: Mobility-Aware Optimal Service Allocation in Mobile Cloud Computing
User mobility introduces new complexities in enabling an optimal decomposition of tasks that can
execute cooperatively on mobile clients and the tiered cloud architecture while considering multiple
QoS goals such as application delay, device power consumption and user cost/price.
Apart from scalability and access issues with the increased number of users, mobile applications are
faced with increased latencies and reduced reliability
As a user moves, the physical distance between the user and the cloud resources originally
provisioned changes causing additional delays
Further, the lack of effective handoff mechanisms in WiFi networks as users move rapidly causes an
increase in the number of packet losses
In other words, user mobility, if not addressed properly, can result in suboptimal resource mapping
choices and ultimately in diminished application QoS
Efficient techniques for dynamic mapping of resources in the presence of mobility; using
a tiered cloud architecture, to meet the multidimensional QoS needs of mobile users
Location-time workflow (LTW) as the modeling framework to model mobile applications and
capture user mobility. Within this framework, mobile service usage patterns as a function of
location and time have been formally modelled
Given a mobile application execution expressed as a LTW, the framework optimally partitions
the execution of the location-time workflow in the 2-tier architecture based on a utility metric
that combines service price, power consumption and delay of the mobile applications
Location Map: a partition of the 2-D space/region in which mobile hosts and cloud resources are located
User Service Set: the set of all services that a user has on his own device (e.g. decoders, image editors etc.)
Mobile User Trajectory: the trajectory of a mobile user, uk, is represented as a list of tuples of the form
{(1; lk); …; (n; lm)} where (i; lj) implies that the mobile user is in location lj for time duration i
Center of Mobility: the location where (or near where) a mobile user uk spends most of its time
Combination of the mobile application workflow concept with a user trajectory to model the mobile users
and the requested services in their trajectory.
Location-Time Workflow
As the number of vehicles increases, parking spaces are becoming increasingly insufficient in many large
cities, and this problem is gradually getting worse
With the proliferation of wireless sensor networks (WSNs) and cloud computing, there exists strong potential
to alleviate this problem using context information (e.g., road conditions and status of parking garages) to
provide context-aware dynamic parking services
Cloud Assisted parking services (traditional parking garages and dynamic parking services along the road) and
parking reservation service using smart terminals such as smartphones.
28
Traditional parking garages:
The context information of each
parking space detected by a WSN
is forwarded to the traffic cloud
by WSNs, third-generation (3G)
communications, and the
Internet.
The collected data are processed
in the cloud and then selectively
transmitted to the users.
This is helpful for providing more
convenient services and
evaluating the utilization levels of
the parking garage.
Also, the status of the parking
garage may be dynamically
published on a nearby billboard
to users who have no ability to
get the status by smart terminals.
30
A Case Study: Context Aware Dynamic Parking Service
Three aspects, including service planning of traffic authorities, reservation service process, and context-aware
optimization have been studied.
In order to make an effective prediction, researchers on vehicular social networks carry out traffic data
mining to discover useful information and knowledge from collected big data. The prediction process
depends on classifying the influence factors and designing a decision tree
By the method of probability analysis, the traffic authorities dynamically decide whether the road
can be authorized to provide context-aware parking services. In some particular cases, a fatal factor
may directly affect the decision making. For example, when a typhoon is approaching, traffic
authorities may immediately terminate services
31
A Case Study: Context Aware Dynamic Parking Service
Parking reservation services:
The status of a parking space can be monitored as determined by the corresponding system, and
subsequently updated in the traffic cloud.
The drivers or passengers can quickly obtain the parking space’s information by various smart
terminals such as smartphones. If a proper parking space cannot be found, the search scope is
extended further.
Within a given time, we may log into the traffic cloud and subscribe to a parking space.
32
A Case Study: Context Aware Dynamic Parking Service
33
A Case Study: Context Aware Dynamic Parking Service
Context-aware optimization:
The context information includes not only road conditions and the status of the parking garage, but
also the expected duration of parking.
Since the purpose of a visit to the place in question can determine the expected duration of parking,
this context information can be used to optimize the best parking locations for drivers.
For the parked vehicles, the expected duration of parking can be uploaded to the traffic cloud and
shared with potential drivers after analysis.
In this way, even when the parking garage has no empty parking spaces available, drivers still can
inquire as to the status of the parking garage and get the desired service by context-aware
optimization.
The proposed context-aware dynamic parking service is a promising solution for alleviating parking
difficulties and improving the QoS of CVC. Many technologies such as WSNs, traffic clouds, and
traffic data mining are enabling this application scenario to become a reality
34
Summary
Mobile cloud computing is one of the mobile technology trends in the future
because it combines the advantages of both MC and CC, thereby providing optimal
services for mobile users
MCC focuses more on user experience : Lower battery consumption , Faster
application execution
MCC architectures design the middleware to partition an application execution
transparently between mobile device and cloud servers
The applications supported by MCC, including m-commerce, m-learning, and mobile
healthcare, show the applicability of MCC to a wide range of domains.
The issues and challenges for MCC (i.e., from communication and computing sides)
demonstrate future research avenues and directions.
35
References
Dinh, Hoang T., et al. “A survey of mobile cloud computing: architecture, applications, and approaches.” Wireless communications and mobile
computing 13.18 (2013): 1587-1611
Z. Li, C. Wang, and R. Xu, “Computation offloading to save energy on handheld devices: a partition scheme,” in Proc 2001 Intl Conf on
Compilers, architecture, and synthesis for embedded systems (CASES), pp. 238-246, Nov 2001.
K. Kumar and Y. Lu, “Cloud Computing for Mobile Users: Can Offloading Computation Save Energy,” IEEE Computer, vol. 43, no. 4, April 2010
H. H. La and S. D. Kim, “A Conceptual Framework for Provisioning Context-aware Mobile Cloud Services,” in Proceedings of IEEE International
Conference on Cloud Computing (CLOUD), pp. 466, August 2010
Gordon, Mark S., et al. "COMET: Code Offload by Migrating Execution Transparently." OSDI. 2012.
Yang, Seungjun, et al. "Fast dynamic execution offloading for efficient mobile cloud computing." Pervasive Computing and Communications
(PerCom), 2013 IEEE International Conference on. IEEE, 2013
Shiraz, Muhammad, et al. "A review on distributed application processing frameworks in smart mobile devices for mobile cloud
computing."Communications Surveys & Tutorials, IEEE 15.3 (2013): 1294-1313
https://www.ibm.com/cloud-computing/learn-more/what-is-mobile-cloud-computing/
36
37
CLOUD COMPUTING
Fog Computing - I
2
Cloud Computing – Typical Characteristics
• Dynamic scalability: Application can handle increasing load by getting
more resources.
• No Infrastructure Management by User: Infrastructure is managed by
cloud provider, not by end-user or application developer.
• Metered Service: Pay-as-you-go model. No capital expenditure for public
cloud.
3
Issues with “Cloud-only” Computing
• Communication takes a long time due to human-smartphone interaction.
• Datacenters are centralized, so all the data from different regions can cause congestion in the
core network.
• Such a task requires very low response time, to prevent further crashes or traffic jams.
(Figure: accident locations in different regions reporting to a single, distant datacenter location)
4
Fog Computing
• Fog computing, also known as fogging/edge computing, is a model in which data,
processing and applications are concentrated in devices at the network edge rather
than existing almost entirely in the cloud.
• The term "Fog Computing" was introduced by the Cisco Systems as new model to ease
wireless data transfer to distributed devices in the Internet of Things (IoT) network
paradigm
• CISCO’s vision of fog computing is to enable applications on billions of connected
devices to run directly at the network edge.
– Users can develop, manage and run software applications on Cisco framework of networked
devices, including hardened routers and switches.
– Cisco brings the open source Linux and network operating system together in a single
networked device
5
Fog Computing
• Bringing intelligence down from the
cloud close to the ground/ end-user.
• Cellular base stations, Network
routers, WiFi Gateways will be
capable of running applications.
• End devices, like sensors, are able to
perform basic data processing.
• Processing close to devices lowers
response time, enabling real-time
applications. Source: The Fog Computing Paradigm: Scenarios and Security Issues,
Ivan Stojmenovic and Sheng Wen
6
Fog Computing
• Fog computing enables some transactions and resources to be handled at the edge of the
cloud, rather than establishing channels for cloud storage and utilization.
• Fog computing reduces the need for bandwidth by not sending every bit of
information over cloud channels, and instead aggregating it at certain access
points.
• This kind of distributed strategy may help in lowering costs and improving
efficiency.
7
Fog Computing - Motivation
• Fog Computing is a paradigm that extends Cloud and its
services to the edge of the network
• Fog provides data, compute, storage and application services to
the end-user
• Recent developments: Smart Grid, Smart Traffic Light, Connected
Vehicles, Software Defined Network
8
Fog Computing
Source: Internet
9
Fog Computing Enablers
• Virtualization : Virtual machines can be used in edge devices.
• Containers: Reduces the overhead of resource management by using light-weight
virtualizations. Example: Docker containers.
• Service Oriented Architecture: Service-oriented architecture (SOA) is a style of
software design where services are provided to the other components by
application components, through a communication protocol over a network.
• Software Defined Networking: Software defined networking (SDN) is an approach
to using open protocols, such as OpenFlow, to apply globally aware software
control at the edges of the network to access network switches and routers that
typically would use closed and proprietary firmware.
10
Fog Computing - not a replacement of Cloud Computing
• Fog/edge devices are there to help the Cloud datacenter achieve better response
time for real-time applications. Handshaking between Fog and Cloud
computing is needed.
• Broadly, benefits of Fog computing are:
– Low latency and location awareness
– Widespread geographical distribution
– Mobility
– Very large number of nodes
– Predominant role of wireless access
– Strong presence of streaming and real time applications
– Heterogeneity
11
FOG Advantages ?
• Fog can be distinguished from Cloud by its proximity to end-
users.
• Dense geographical distribution and its support for mobility.
• It provides low latency, location awareness, and improved
quality of service (QoS) for real-time applications.
12
Security Issues
• Major security issues are authentication at different levels of gateways as
well as in the Fog nodes
• Man-in-the-Middle-Attack
• Privacy Issues
• In the case of smart grids, smart meters are installed in the consumer’s home.
Each smart meter and smart appliance has an IP address. A malicious user
can either tamper with its own smart meter, report false readings, or
spoof IP addresses.
13
Limitations of Cloud Computing
• High capacity(bandwidth) requirement
• Client access link
• High latency
• Security
“Fog” Solution?
• Reduction in data movement across the network resulting in reduced congestion
• Elimination of bottlenecks resulting from centralized computing systems
• Improved security of encrypted data as it stays closer to the end user
14
Fog Computing and Cloud Computing
Source: Internet
15
Fog Computing and Cloud Computing
Source: Internet
16
Fog Computing Use-cases
• Emergency Evacuation Systems: Real-time information about currently
affected areas of building and exit route planning.
• Natural Disaster Management: Real-time notification about landslides,
flash floods to potentially affected areas.
• Large sensor deployments generate a lot of data, which can be pre-
processed, summarized and then sent to the cloud to reduce congestion in
network.
• Internet of Things (IoT) based big-data applications: Connected Vehicle,
Smart Cities, Wireless Sensors and Actuators Networks(WSANs) etc.
17
Applicability
• Smart Grids
• Smart Traffic Lights
• Wireless Sensors
• Internet of Things
• Software Defined Network
18
Fog Computing and IoT (Internet of Things)
Source: Fog Computing and Its Role in the Internet of Things, Flavio Bonomi, Rodolfo Milito, Jiang Zhu, Sateesh Addepalli
19
Internet of Things
Source: Internet of Things (IoT): A vision, architectural elements, and future directions, Jayavardhana Gubbi, Rajkumar Buyya, Slaven Marusic,
Marimuthu Palaniswami
20
Connected Vehicle (CV)
Source: Fog Computing and Its Role in the Internet of Things, Flavio Bonomi, Rodolfo Milito, Jiang Zhu, Sateesh Addepalli
21
Smart Grid and Fog Computing
Source: The Fog Computing Paradigm: Scenarios and Security Issues, Ivan Stojmenovic and Sheng Wen
22
Fog computing in Smart Traffic Lights and Connected
Vehicles
Source: The Fog Computing Paradigm: Scenarios and Security Issues, Ivan Stojmenovic and Sheng Wen
23
Thank You!!
24
CLOUD COMPUTING
Fog Computing - II
2
Fog Computing
Source: Internet
3
Fog Computing and Cloud Computing
Source: Internet
4
Fog Computing and Cloud Computing
Source: Internet
5
Fog Computing Use-cases
• Emergency Evacuation Systems: Real-time information about currently
affected areas of building and exit route planning.
• Natural Disaster Management: Real-time notification about landslides,
flash floods to potentially affected areas.
• Large sensor deployments generate a lot of data, which can be pre-
processed, summarized and then sent to the cloud to reduce congestion in
network.
• Internet of Things (IoT) based big-data applications: Connected Vehicle,
Smart Cities, Wireless Sensors and Actuators Networks(WSANs) etc.
6
Applicability
• Smart Traffic Lights
• Connected Vehicles
• Smart Grids
• Wireless Sensors
• Internet of Things
• Software Defined Network
7
Connected Vehicle (CV)
Source: Fog Computing and Its Role in the Internet of Things, Flavio Bonomi, Rodolfo Milito, Jiang Zhu, Sateesh Addepalli
8
Fog Computing in Smart Traffic Lights and Connected Vehicles
Source: The Fog Computing Paradigm: Scenarios and Security Issues, Ivan Stojmenovic and Sheng Wen
9
Fog Computing and IoT (Internet of Things)
Source: Fog Computing and Its Role in the Internet of Things, Flavio Bonomi, Rodolfo Milito, Jiang Zhu, Sateesh Addepalli
10
Fog Computing and Smart Grid
Source: The Fog Computing Paradigm: Scenarios and Security Issues, Ivan Stojmenovic and Sheng Wen
11
Fog Challenges
• Fog computing systems face the challenge of proper resource allocation
among the applications while ensuring the end-to-end latency of the
services.
• Resource management of the fog computing network has to be addressed
so that the system throughput increases ensuring high availability as well as
scalability.
• Security of Applications/Services/Data
12
Resource Management of Fog network
• Utilization of idle fog nodes for better throughput
• More parallel operations
• Handling load balancing
• Meeting the delay requirements of real-time applications
• Provisioning crash fault-tolerance
• More scalable system
13
Resource Management – Challenges
• Data may not be available at the executing fog node. Therefore, data fetching
is needed from the required sensor or data source.
• The executing node might become unresponsive due to heavy workload,
which compromises the latency.
• Choosing a new node in case of micro-service execution migration so that the
response time gets reduced.
• Due to unavailability of an executing node, there is a need to migrate the
partially processed persistent data to a new node. (State migration)
14
Resource Management – Challenges (contd…)
• Due to unavailability of an executing node, there is a need to migrate the
partially processed persistent data to a new node. (State migration)
• The final result has to be transferred to the client or actuator within a very short
time.
• Deploying application components in different fog computing nodes ensuring
latency requirement of the components.
• Multiple applications may collocate in the same fog node. Therefore, the data
of one application may get compromised by another application. Data
security and integrity of individual applications by resource isolation has to be
ensured.
15
Resource Management – Approaches
• Execution migration to the nearest node from the mobile client.
• Minimizing the carbon footprint for video streaming service in fog computing.
• Emphasis on resource prediction, resource estimation and reservation,
advance reservation as well as pricing for new and existing IoT customers.
• Docker as an edge computing platform. Docker may facilitate fast
deployment, elasticity and good performance over virtual machine based
edge computing platform.
16
Resource Management – Approaches (contd…)
• Resource management based on the fluctuating relinquish probability of the
customers, service price, service type and variance of the relinquish
probability.
• Studying the base station association, task distribution, and virtual machine
placement for cost-efficient fog-based medical cyber-physical systems. The
problem can be formulated as a mixed-integer non-linear program and then
linearized into a mixed-integer linear program (MILP). An LP-based
two-phase heuristic algorithm has been developed to address the
computational complexity.
17
Fog - Security Issues
• Major security issues are authentication at different levels of gateways as
well as in the Fog nodes
• Man-in-the-Middle-Attack
• Privacy Issues
• In the case of smart grids, smart meters are installed in the consumer’s home.
Each smart meter and smart appliance has an IP address. A malicious user
can either tamper with its own smart meter, report false readings, or
spoof IP addresses.
18
Thank You!!
19
Cloud Computing
Use Case: Geospatial Cloud
Soumya K Ghosh
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
[email protected]
Geospatial Information
Geospatial Cloud
IIT Kharagpur Geo-Cloud
Data intensive
Computation Intensive
Variable Load on the GIS server demands
dynamic scaling in/out of resources
GIS requires high level of reliability and
performance
Uses Network intensive web services
Quality of Service
Security
…
Heterogeneity Issue
GIS layers are often developed by diverse departments
relying on a mix of software and information systems
Issues to be resolved
Making data description homogeneous
Standard encoding for data
Standard mechanism for data sharing
(Figure: Societal, Multi-User (Enterprise), Groups/Teams, Projects)
Data sources:
• Central Data Repository within the cloud.
• External Data Repository providing data as WFS,WMS
services.
(Figure: GIS layers: Highway, Local Roads, Merged Road Network; Canal, River; Buffer on Merged
Water Network (Zoomed))
Challenges in Geospatial Cloud
Multi-tenancy
Ref: Internet/YouTube
Goal: Interoperability
Ref: Internet/YouTube
“Shipping”
Ref: Internet/YouTube
“Shipping”
Ref: Internet/YouTube
“Docker”
Ref: Internet/YouTube
Docker – Features
Ref: https://www.tutorialspoint.com/docker/
Docker – Components
• Docker for Mac − It allows one to run Docker containers on the Mac OS.
• Docker for Linux − It allows one to run Docker containers on the Linux OS.
• Docker for Windows − It allows one to run Docker containers on the Windows
OS.
• Docker Engine − It is used for building Docker images and creating Docker
containers.
• Docker Hub − This is the registry which is used to host various Docker images.
• Docker Compose − This is used to define applications using multiple Docker
containers.
Ref: https://www.tutorialspoint.com/docker/
Traditional Virtualization
• Server is the physical server that is used to
host multiple virtual machines.
• Host OS is the base machine such as Linux
or Windows.
• Hypervisor is either VMWare or Windows
Hyper V that is used to host virtual machines.
• One would then install multiple operating
systems as virtual machines on top of the
existing hypervisor as Guest OS.
• One would then host the applications on top
of each Guest OS. Ref: https://www.tutorialspoint.com/docker/
Docker – Architecture
Ref: https://www.tutorialspoint.com/docker/
Container?
• Containers are an abstraction at the app layer that packages code
and dependencies together.
• Multiple containers can run on the same machine and share the OS
kernel with other containers, each running as isolated processes in
user space.
Ref: https://www.docker.com/
Container (contd…)
• An image is a lightweight, stand-alone, executable package that includes everything
needed to run a piece of software, including the code, a runtime, libraries, environment
variables, and config files.
• Containers run apps natively on the host machine’s kernel. They have better
performance characteristics than virtual machines that only get virtual access to host
resources through a hypervisor. Containers can get native access, each one running in a
discrete process, taking no more memory than any other executable.
Ref: https://www.docker.com/
Containers and Virtual Machines
• Containers can share a single kernel, and the only information that needs to
be in a container image is the executable and its package dependencies,
which never need to be installed on the host system.
• These processes run like native processes, and can be managed individually
• Because they contain all their dependencies, there is no configuration
entanglement; a containerized app “runs anywhere”
Ref: https://www.docker.com/
Docker containers are lightweight
Ref: Internet
How does Docker work
Source: Internet
Containers and Virtual Machines Together
Ref: https://www.docker.com/
Why is Docker needed for applications?
• Application level virtualization.
Ref: https://www.docker.com/
Terminology - Container
• Runnable instance of an image
• ps: List all running containers
• ps –a: List all containers (incl. stopped)
• top: Display processes of a container
• start: Start a stopped container
• stop: Stop a running container
• pause: Pause all processes within a container
• rm: Delete a container
• commit: Create an image from a container
Ref: https://www.docker.com/
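The container operations listed above can also be driven programmatically. Below is a minimal sketch using the Docker SDK for Python (docker-py), assuming a local Docker daemon and the public "alpine" image; the image and tag names are only examples, not part of the original slides.
Example (Python, illustrative):
import docker

client = docker.from_env()                                  # connect to the local Docker daemon

container = client.containers.run("alpine", "sleep 300", detach=True)  # like `docker run -d`
print(client.containers.list())                             # like `ps` (running containers)
print(client.containers.list(all=True))                     # like `ps -a` (incl. stopped)
print(container.top())                                      # like `top` (processes in container)

container.pause()                                           # like `pause`
container.unpause()
container.stop()                                            # like `stop`
container.commit(repository="alpine-sleep", tag="v1")       # like `commit` (image from container)
container.remove()                                          # like `rm`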
Dockerfile
• Create images automatically using a build script:
«Dockerfile»
• Can be versioned in a version control system like Git or
SVN, along with all dependencies
• Docker Hub can automatically build images based on Dockerfiles hosted on GitHub
Ref: https://www.docker.com/
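As a rough illustration of the Dockerfile workflow, the snippet below builds an image from a Dockerfile kept in a version-controlled directory, again via the Docker SDK for Python; the directory and tag names are hypothetical.
Example (Python, illustrative):
import docker

client = docker.from_env()
# Assumes a hypothetical directory "./myapp" containing a Dockerfile under version control.
image, build_logs = client.images.build(path="./myapp", tag="myapp:latest")
for chunk in build_logs:
    if "stream" in chunk:
        print(chunk["stream"], end="")      # build output, step by step
print("Built image:", image.tags)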
Docker Hub
• Public repository of Docker images
• https://hub.docker.com/
• Automated: Has been automatically built from Dockerfile
• Source for build is available on GitHub
Ref: https://www.docker.com/
Docker – Usage
• Docker is the world’s leading software container platform.
• Enterprises use Docker to build agile software delivery pipelines to ship new
features faster, more securely, and with confidence for Linux, Windows Server,
and Linux-on-mainframe apps.
Ref: https://www.docker.com/
Thank You!
CLOUD COMPUTING
Green Cloud
PROF. SOUMYA K. GHOSH
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
IIT KHARAGPUR
Cloud Computing
Cloud computing is a model for enabling convenient, on-demand network
access to a shared pool of configurable computing resources like networks,
servers, storage, applications, and services.
Source: Internet
Source: Internet
Green Cloud ?
• Green computing is the environmentally responsible and eco-friendly
use of computers and their resources.
• In broader terms, it is also defined as the study of designing,
manufacturing or engineering, using and disposing of computing
devices in a way that reduces their environmental impact.
• Green Cloud computing is envisioned to achieve not only efficient processing and
utilization of computing infrastructure, but also to minimize energy consumption.
Source: Internet
Cloud Advantages
• Reduce spending on technology infrastructure. Maintain easy access to information
with minimal upfront spending. Pay as you go based on demand.
• Globalize your workforce on the cheap. People worldwide can access the cloud,
provided they have an Internet connection.
• Streamline processes. Get more work done in less time with fewer people.
• Reduce capital costs. There’s no need to spend big money on hardware, software or
licensing fees.
• Improve accessibility. You have access anytime, anywhere, making your life so much
easier!
• Minimize licensing new software. Stretch and grow without the need to buy expensive
software licenses or programs.
• Improve flexibility. You can change direction without serious financial issues at stake.
Source: Internet
Cloud – Challenge
• Gartner Report 2007: IT industry contributes 2% of world's total CO2
emissions
• U.S. EPA Report 2007: 1.5% of total U.S. power consumption used by
data centers which has more than doubled since 2000 and costs $4.5
billion
Source: Internet
Importance of Energy
• Increased computing demand
• Data centers are rapidly growing
• Consume 10 to 100 times more energy per square foot than a typical office
building
[Figure: typical data center power breakdown – cooling system ≈45%, IT equipment ≈40%, power distribution ≈15%]
Two-tier DC architecture
• Access and Core layers
• 1 GE and 10 GE links
• Full mesh core network
• Load balancing using ECMP
[Figure: server power consumption as a function of CPU frequency and server load, ranging from Fmin up to peak power Ppeak]
Source: Internet
Performance <-> Energy Efficiency
As energy costs are increasing while availability decreases, there is a need to shift focus
from optimizing data center resource management for pure performance alone to optimizing
for energy efficiency while maintaining high service level performance.
Source: Internet
CSP Initiatives
• Cloud service providers need to adopt measures to ensure that their profit
margin is not dramatically reduced due to high energy costs.
• Amazon.com estimates that the energy-related costs of its data centers amount to
42% of the total budget, including both direct power consumption and the cooling
infrastructure, amortized over a 15-year period.
• Google, Microsoft, and Yahoo are building large data centers in barren desert
land surrounding the Columbia River, USA to exploit cheap hydroelectric
power.
Source: Internet
A Typical Green Cloud Architecture
Source: Internet
Green Broker
User
Source: Internet
Green Middleware
Source: Internet
Power Usage Effectiveness (PUE)
Source: Internet
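The PUE metric itself is not spelled out on the slide image; a commonly used definition is:
PUE = total facility energy / IT equipment energy
For example, a facility drawing 1.5 MW in total to deliver 1.0 MW to its IT equipment has PUE = 1.5 / 1.0 = 1.5. An ideal data center approaches PUE = 1.0; the cooling and power-distribution overheads shown earlier are what push PUE above 1.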
Conclusions
• Clouds are essentially Data Centers hosting application services offered on a
subscription basis. However, they consume high energy to maintain their
operations.
=> high operational cost + environmental impact
• Presented a Carbon Aware Green Cloud Framework to improve the carbon
footprint of Cloud computing.
• Open Issues: considerable research remains to be carried out on maximizing the
efficiency of green data centers and on enabling developing regions to benefit the most.
Source: Internet
Thank You!
CLOUD COMPUTING
Sensor Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
1
Motivation
Increasing adoption of sensing technologies (e.g.,
RFID, cameras, mobile phones)
Internet has become a source of real time
information (e.g., through blogs, social networks,
live forums) for events happening around us
3
Limitations of Sensor Networks
• Very challenging to scale sensor networks to large sizes
• Proprietary vendor-specific designs. Difficult for different sensor networks to be
interconnected
• Sensor data cannot be easily shared by different groups of users.
• Insufficient computational and storage resources to handle large-scale applications.
• Used for fixed and specific applications that cannot be easily changed once deployed.
• Slow adoption of large-scale sensor network applications.
4
Limitations of Cloud Computing!
The immense power of the Cloud can only be fully exploited if it is seamlessly
integrated into our physical lives.
That means – providing the real world’s information to the Cloud in real time
and getting the Cloud to act and serve us instantly.
That is – adding the sensing capability to the Cloud
5
[Figure: “What is missing?” – cloud servers, computing platforms, mobile computing, applications, storage, security, economics, social networks, services, and code, with the sensor network shown as the missing piece]
6
A Motivating Scenario!
1. Let’s go to the mountain peak!
2. Sounds good!
   I. Please take your lunch as you appear hungry!
   II. Carry drinking water – water at that region is contaminated
   III. Use anti-UV skin cream
Source: Internet
7
A few insights from the example!
Cell phone records the tourist’s gestures and activates applications such as camera,
microphone, etc.
Cell phone produces very swift responses in real time after:
– Processing geographical data
– Acquiring the tourist’s physiological data from wearable physiological sensors (blood sugar, perspiration, etc.) and cross-comparing it with his medical records
– Speech recognition
– Image processing of restaurant’s logos and accessing their internet-based profiles
– Accessing tourist’s social network profiles to find out his friends
Ref: http://www.ntu.edu.sg/intellisys
8
Need to integrate Sensors with Cloud!
Acquisition of data feeds from numerous body area (blood sugar, heat,
perspiration, etc) and wide area (water quality, weather monitoring, etc.)
sensor networks in real time.
Real-time processing of heterogeneous data sources in order to make
critical decisions.
Automatic formation of workflows and invocation of services on the cloud
one after another to carry out complex tasks.
Highly swift data processing using the immense processing power of the
cloud to provide quick response to the user.
9
What is Sensor Cloud Computing?
An infrastructure that allows truly pervasive computation using sensors as interface
between physical and cyber worlds, the data-compute clusters as the cyber backbone
and the internet as the communication medium
It integrates large-scale sensor networks with sensing applications and cloud computing
infrastructures.
It collects and processes data from various sensor networks.
Enables large-scale data sharing and collaborations among users and applications on the cloud.
Delivers cloud services via sensor-rich devices.
Allows cross-disciplinary applications that span organizational boundaries.
10
Sensor Cloud?
Enables users to easily collect, access, process, visualize, archive, share
and search large amounts of sensor data from different applications.
Supports complete sensor data life cycle from data collection to the
backend decision support system.
Sensor cloud enables different networks, spread in a huge geographical area, to connect together and be employed simultaneously by multiple users on demand.
Vast amounts of sensor data can be processed, analyzed, and stored using the computational and storage resources of the cloud.
Allows sharing of sensor resources by different users and applications
under flexible usage scenarios.
Enables sensor devices to handle specialized processing tasks.
11
Overview of Sensor-Cloud Framework
12
Overview of Sensor-Cloud Framework
Sensor-Cloud Proxy
Interface between sensor resources and the cloud fabric.
Manages sensor network connectivity between the sensor
resources and the cloud.
Exposes sensor resources as cloud services.
Manages sensor resources via indexing services.
Uses cloud discovery services for resource tracking.
Manages sensing jobs for programmable sensor networks.
Manages data from sensor networks
• Data format conversion into standard formats (e.g. XML)
• Data cleaning and aggregation to improve data quality
• Data transfer to cloud storage
Sensor-cloud proxy can be virtualized and lives on the cloud !
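As a rough sketch of the proxy’s data-format-conversion step, the snippet below wraps a raw sensor reading into a simple XML document before it would be cleaned, aggregated, and transferred to cloud storage. The field names and values are hypothetical, not part of the original slides.
Example (Python, illustrative):
import xml.etree.ElementTree as ET

# Hypothetical raw reading as it might arrive from a sensor network.
reading = {"sensor_id": "temp-042", "type": "temperature", "value": 27.4,
           "unit": "C", "timestamp": "2024-01-01T10:00:00Z"}

obs = ET.Element("observation")
for key, value in reading.items():
    ET.SubElement(obs, key).text = str(value)       # one XML element per field

xml_payload = ET.tostring(obs, encoding="unicode")
print(xml_payload)   # <observation><sensor_id>temp-042</sensor_id>...</observation>
# The proxy would then aggregate/clean such records and push them to cloud storage.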
13
Overview of Sensor-Cloud Framework
14
Another Use case...
• Traffic flow sensors are widely deployed in large numbers in places/ cities.
• These sensors are mounted on traffic lights and provide real-time traffic flow
data.
• Drivers can use this data to better plan their trips.
• In addition, if the traffic flow sensors are augmented with low-cost humidity
and temperature sensors, they can provide a customized and local view of
temperature and heat index data on demand.
• The national weather service, on the other hand, uses a single weather
station to collect environmental data for a large area, which might not
accurately represent an entire region.
15
Overview of Sensor Cloud Infrastructure
19
Virtual Sensor Configurations
(a) one-to-many, many-to-one, and many-to-many, and (b) derived
21
Virtual Sensor Configurations
(a) one-to-many, many-to-one, and many-to-many, and (b) derived
22
Virtual Sensor Configurations
(a) one-to-many, many-to-one, and many-to-many, and (b) derived
23
Virtual Sensor Configurations
(a) one-to-many, many-to-one, and many-to-many, and (b) derived
Derived:
A derived configuration refers to a versatile
configuration of virtual sensors derived from a
combination of multiple physical sensors.
This configuration can be seen as a generalization of
the other three configurations, though, the
difference lies in the types of physical sensors with
which a virtual sensor communicates.
While in the derived configuration, the virtual
sensor communicates with multiple sensor types; in
the other three configurations, the virtual sensor
communicates with the same type of physical
sensors.
Derived sensors can be used in two ways: first, to
virtually sense complex phenomena and, second, to
substitute for sensors that aren’t physically
deployed.
24
Virtual Sensor Configurations
(a) one-to-many, many-to-one, and many-to-many, and (b) derived
25
A Layered Sensor Cloud Architecture
26
Summary
Sensor-Cloud infrastructure virtualizes sensors and provides the management mechanism for
virtualized sensors
Sensor-Cloud infrastructure enables end users to create virtual sensor groups dynamically by
selecting the templates of virtual sensors or virtual sensor groups with IT resources.
Sensor-Cloud infrastructure focuses on Sensor system management and Sensor data
management
Sensor clouds aim to take the burden of deploying and managing the network away from the
user by acting as a mediator between the user and the sensor networks and providing
sensing as a service.
27
References
Beng, Lim Hock. "Sensor cloud: Towards sensor-enabled cloud services." Intelligent Systems
Center Nanyang Technological University (2009)
http://www.ntu.edu.sg/intellisys
Sanjay et al. “Sensor Cloud: A Cloud of Virtual Sensors” , IEEE Software, 2014
Madoka et al. “Sensor-Cloud Infrastructure Physical Sensor Management with Virtualized
Sensors on Cloud Computing”
28
29
CLOUD COMPUTING
IoT Cloud
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
IIT KHARAGPUR
1
Motivation
Increasing adoption of sensing technologies
(e.g., RFID, cameras, mobile phones)
Sensor devices are becoming widely
available
Example: Sensors in an electronic jacket can collect information about changes in external
temperature and the parameters of the jacket can be adjusted accordingly
Source: Internet 3
More “Things” are being connected!
Home/daily-life devices
Business
Public infrastructure
Health-care and so on…
4
Any time, Any place connectivity for Anyone and Anything!
5
Basic IoT Architecture
An IoT platform has basically three building
blocks
Things
Gateway
Network and Cloud
6
Several Aspects of IoT systems!
Scalability: Scale for IoT system applies in terms of the numbers of sensors and actuators
connected to the system, in terms of the networks which connect them together, in terms of the
amount of data associated with the system and its speed of movement and also in terms of the
amount of processing power required.
Big Data: Many more advanced IoT systems depend on the analysis of vast quantities of data. There
is a need, for example, to extract patterns from historical data that can be used to drive decisions
about future actions. IoT systems are thus often classic examples of “Big Data” processing.
Role of Cloud computing: IoT systems frequently involve the use of cloud computing platforms.
Cloud computing platforms offer the potential to use large amounts of resources, both in terms of
the storage of data and also in the ability to bring flexible and scalable processing resources to the
analysis of data. IoT systems are likely to require the use of a variety of processing software – and
the adaptability of cloud services is likely to be required in order to deal with new requirements,
firmware or system updates and offer new capabilities over time.
7
Several Aspects of IoT systems (contd…)
Real time: IoT systems often function in real time; data flows in continually about events in
progress and there can be a need to produce timely responses to that stream of events.
Highly distributed: IoT systems can span whole buildings, span whole cities, and even span the
globe. Wide distribution can also apply to data – which can be stored at the edge of the network or
stored centrally. Distribution can also apply to processing – some processing takes place centrally
(in cloud services), but processing can take place at the edge of the network, either in the IoT
gateways or even within (more capable types of) sensors and actuators. Today there are officially
more mobile devices than people in the world. Mobile devices and networks are one of the best
known IoT devices and networks.
Heterogeneous systems: IoT systems are often built using a very heterogeneous set of technologies. This applies
to the sensors and actuators, but also applies to the types of networks involved and the variety of
processing components. It is common for sensors to be low-power devices, and it is often the case
that these devices use specialized local networks to communicate. To enable internet scale access
to devices of this kind, an IoT gateway is used
8
Cloud Computing!
Cloud computing enables companies and applications, which are system-infrastructure dependent, to become infrastructure-less.
[Figure: cloud computing platform – servers, applications, and mobile computing]
9
Cloud Computing!
10
IoT Cloud Systems?
Recently, there is a wide adoption and deployment of Internet of Things (IoT) infrastructures
and systems for various crucial applications, such as logistics, smart cities, and healthcare.
This has led to high demands on data storage, processing, and management services in
cloud-based data centers, engendering strong integration needs between IoT and cloud
services.
Cloud services are mature and provide excellent elastic computation and data management
capabilities for IoT. In addition, as IoT systems become complex, IoT management
techniques are increasingly employed to manage IoT components.
Thus, cloud services now act as computational and data processing platforms as well as
management platforms for IoT. From a high-level view, IoT appears to be well-integrated with
cloud data centers to establish a uniform infrastructure for IoT Cloud applications.
An integration between IoT and cloud services allows coordination among IoT and cloud
services. That is, a cloud service can request an IoT service, which includes several IoT
elements, to reduce the amount of sensing data, or the cloud management service can request
cloud services to provision more resources for future incoming data.
11
Cloud Components for IoT
12
iCOMOT: An IoT Cloud System
Top layer represents typical IoT applications executed across IoT and Clouds.
Middle layer represents the software layer as an IoT cloud system built on top of various types of cloud services and IoT elements.
Bottom layer shows different tools and services from iCOMOT that can be used to monitor, control, and configure the software layer.
H.-L. Truong et al., “iCOMOT: Toolset for Managing IoT Cloud Systems,”
Demo, 16th IEEE Int’l Conf. Mobile Data Management, 2015
13
Infrastructure, Protocols and Software Platforms
for establishing an Internet of Things (IoT) Cloud system
He, Wu, Gongjun Yan, and Li Da Xu. "Developing vehicular data cloud services in the IoT
environment." IEEE Transactions on Industrial Informatics 10.2 (2014): 1587-1595.
15
Services for IoT-based Vehicular Data Clouds
16
Architecture for Intelligent Parking Cloud service
17
Vacancy detections by Sensors
18
Parking cloud service
19
Summary
Internet of Things (IoT) is a dynamic and exciting area of IT. Many IoT systems are going to be
created over the next few years, covering a wide variety of areas, such as domestic, commercial,
industrial, health, and government contexts
IoT systems have several challenges, namely scale, speed, safety, security and privacy
Cloud computing platforms offer the potential to use large amounts of resources, both in
terms of the storage of data and also in the ability to bring flexible and scalable processing
resources to the analysis of data
20
References
Cloud Standards Customer Council 2015, Cloud Customer Architecture for Big Data and
Analytics, Version 1.1 http://www.cloud-council.org/deliverables/CSCC-Customer-
Cloud-Architecture-for-Big-Data-andAnalytics.pdf
He, Wu, Gongjun Yan, and Li Da Xu. "Developing vehicular data cloud services in the
IoT environment." IEEE Transactions on Industrial Informatics 10.2 (2014): 1587-1595.
H.-L. Truong et al., “iCOMOT: Toolset for Managing IoT Cloud Systems,” demo, 16th
IEEE Int’l Conf. Mobile Data Management, 2015
Truong, Hong-Linh, and Schahram Dustdar. "Principles for engineering IoT cloud
systems." IEEE Cloud Computing 2.2 (2015): 68-76
21
22
CLOUD COMPUTING
Course Summary and Research Areas
2
Course Summary (contd.)
• Data Management in Cloud Computing
• Data, Scalability & Cloud Services
• Database & Data Stores in Cloud
• GFS, HDFS, Map-Reduce paradigm
• Cloud Security
• Identity & Access Management
• Access Control
• Trust, Reputation, Risk
• Authentication in cloud computing
• Case Study on Open Source and Commercial Clouds
• Research trend - Fog Computing, Sensor Cloud, Container Technology, Green
Cloud etc.
3
Cloud Computing – Research Areas
4
Cloud Infrastructure and Services
• Cloud Computing Architectures
• Storage and Data Architectures
• Distributed and Cloud Networking
• Infrastructure Technologies
5
Cloud Management, Operations and Monitoring
• Cloud Composition, Service Orchestration
• Cloud Federation, Bridging, and Bursting
• Cloud Migration
• Hybrid Cloud Integration
• Green and Energy Management of Cloud Computing
• Configuration and Capacity Management
• Cloud Workload Profiling and Deployment Control
• Cloud Metering, Monitoring, Auditing
• Service Management
6
Cloud Security
• Data Privacy
• Access Control
• Identity Management
• Side Channel Attacks
• Security-as-a-Service
7
Performance, Scalability, Reliability
• Performance of cloud systems and Applications
• Cloud Availability and Reliability
• Micro-services based architecture
8
Systems Software and Hardware
• Virtualization Technology
• Service Composition
• Cloud Provisioning Orchestration
• Hardware Architecture support for Cloud Computing
9
Data Analytics in Cloud
• Analytics Applications
• Scientific Computing and Data Management
• Big data management and analytics
• Storage, Data, and Analytics Clouds
10
Cloud Computing – Service Management
• Services Discovery and Recommendation
• Services Composition
• Services QoS Management
• Services Security and Privacy
• Semantic Services
• Service Oriented Software Engineering
11
Cloud and Other Technologies
• Fog Computing
• IoT Cloud
• Container Technology
12
Thank You!
13
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 09: Cloud Computing Paradigm
Lecture 41: Cloud–Fog Computing ‐ Overview
Cloud‐Fog Paradigm ‐ Overview
Cloud‐Fog‐Edge/IoT
Fog Computing
NPTEL
Cloud Computing: “Anything”‐as‐a‐Service
NPTEL
Fog Computing
• Fog computing is a model in which data, processing, and applications are
concentrated in devices at the network edge rather than existing
almost entirely in the cloud.
• The term "Fog Computing" was introduced by Cisco Systems as a
new model to ease wireless data transfer to distributed devices in the
Internet of Things (IoT) network paradigm.
• The vision of fog computing is to enable applications on billions of
connected devices to run directly at the network edge.
Cloud vs Fog Computing
NPTEL
Cloud‐Fog‐Edge Computing
Source: https://www.learntechnology.com/network/fog‐computing/
Case Study: Health Cloud‐Fog‐Edge
[Figure: hierarchical Cloud–Fog–Edge topology – Level 0: Cloud, Level 1: ISP, Level 2: Area Gateway, Level 3: Mobile, Level 4: IoT devices]
Case Study: Health Cloud‐Fog‐Edge
Device Configuration
Latency
Case Study: Health Cloud‐Fog‐Edge
NPTEL
• Network usage is very low in case of the Fog architecture, as the
Confirmatory module residing on the Cloud is accessed only for a few
positive cases.
• In case of the Cloud-based architecture, the usage is high as all modules
are now on the Cloud.
Case Study: Energy Consumptions
[Figure: energy consumption (in kJ) – DC, Mobile, and Edge energy – for physical topology configurations Config 2, Config 3, and Config 4, each under Fog and Cloud placements]
Case Study: Prototype Implementation
Lab based setup:
– Raspberry Pi as Fog Devices
– AWS as Cloud
Use different dataset and
customize formulae for analysis
Changes in Resource Allocation
Policy in terms of:
NPTEL
– Customized physical devices
– Customized requirements of
Application Modules
– Module Placement policy
Cisco White Paper. 2015. Fog Computing and the Internet of Things: Extend the
Cloud to Where the Things Are.
Gupta H, Vahid Dastjerdi A, Ghosh SK, Buyya R. iFogSim: A toolkit for modeling
and simulation of resource management techniques in the Internet of Things,
Edge and Fog computing environments. Softw Pract Exper. 2017;47:1275‐296.
https://doi.org/10.1002/spe.2509
Mahmud, Md & Buyya, Rajkumar. (2019). Modeling and Simulation of Fog and
Edge Computing Environments Using iFogSim Toolkit: Principles and Paradigms.
10.1002/9781119525080.ch17
Mahmud, Md and Koch, Fernando and Buyya, Rajkumar. (2018). Cloud‐Fog
Interoperability in IoT‐enabled Healthcare Solutions.
10.1145/3154273.3154347. In proceedings of 19th International Conference on
Distributed Computing and Networking, January 4–7, 2018, Varanasi, India.
ACM, NewYork, NY, USA.
NPTEL
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 09: Cloud Computing Paradigm
Lecture 42: Resource Management ‐ I
Cloud‐Fog Paradigm – Resource
Management Issues
NPTEL
Cloud Computing
Resource Management
NPTEL
Service Placement
Resource Management ‐ I
NPTEL
Challenges in “Cloud‐only” scenario
• Processing IoT applications directly in the cloud may not be the most
efficient solution for each IoT scenario, especially for time‐sensitive
applications.
• A promising alternative is to use fog and edge computing, which
address the issue of managing the large data bandwidth needed by
end devices.
• These paradigms require that the large amounts of generated data be
processed close to the data sources rather than in the cloud.
• One of the considerations of cloud‐based IoT environments is resource
management, which typically revolves around resource allocation,
workload balance, resource provisioning, task scheduling, and QoS to
achieve performance improvements.
Fog‐Edge to support Cloud Computing
• Latency issue: May involve transport of data from each single sensor to a
data center over a network, process these data, and then send
instructions to actuators.
• Fog and edge computing may aid cloud computing in overcoming these
limitations.
• Fog computing and edge computing are not substitutes for cloud
computing, as they do not completely replace it.
• Rather, the three technologies can work together to provide
improved latency, reliability, and faster response times.
Cloud‐Fog Paradigm
NPTEL
Ref: Agarwal, S.; Yadav, S.; Yadav, A.K. An efficient architecture and algorithm for resource provisioning in fog computing. Int. J.
Inf. Eng. Electronic Bus. (IJIEEB) 2016, 8, 48–61.
Fog‐Edge to support Cloud Computing
• Cloud–fog environment model, typically, is composed of three layers: a
client layer (edge), a fog layer, and a cloud layer.
• Fog layer is to accomplish the requirement of resources for clients.
• If there is no/limited availability of resources in the fog layer, then the
request is passed to the cloud layer.
• Main functional components of this model are:
– Fog server manager employs all the available processors to the client.
– Virtual machines (VMs) operate for the fog data server, process them, and
then deliver the results to the fog server manager.
– Fog servers contain fog server manager and virtual machines to manage
requests by using a ’server virtualization technique’.
Fog‐Edge to support Cloud Computing
• Trend is to decentralize some of the computing resources available in
large Cloud data centers by distributing them towards the edge of the
network closer to the end‐users and sensors
• Resources may take the form of either (i) dedicated “micro” data centers
that are conveniently and safely located within public/private
infrastructure, or (ii) Internet nodes, such as routers, gateways, and
switches, that are augmented with computing capabilities.
• A computing model that makes use of resources located at the edge of
the network is referred to as “edge computing”.
• A model that makes use of both edge resources and the cloud is referred
to as “fog computing”
Resource Management in Cloud‐Fog‐Edge
NPTEL
Cheol-Ho Hong, Blesson Varghese, Resource Management in Fog/Edge Computing: A Survey on Architectures, Infrastructure, and
Algorithms, ACM Computing Surveys, Vol 52(5), October 2019, pp 1–37.
Resource Management Approaches
• Architectures ‐ the architectures used for resource management in
fog/edge computing are classified on the basis of data flow, control, and
tenancy
• Infrastructure ‐ The infrastructure for fog/edge computing provides
facilities composed of hardware and software to manage the
computation, network, and storage resources for applications utilizing the fog/edge.
• Algorithms ‐ There are several underlying algorithms used to facilitate
fog/edge computing.
Resource Management Approaches ‐ Architectures
Architectures
• Data Flow: Based on the direction of movement of workloads and data in the
computing ecosystem. For example, workloads could be transferred from the user
devices to the edge nodes or alternatively from cloud servers to the edge nodes.
• Control: Based on how the resources are controlled
in the computing ecosystem. For
example, a single controller or central algorithm may be used for managing a number
of edge nodes. Alternatively, a distributed approach may be employed.
• Tenancy: Based on the support provided for hosting multiple entities in the
ecosystem. For example, either a single application or multiple applications could be
hosted on an edge node.
Resource Management Approaches ‐ Infrastructure
Infrastructure
Resource Management
NPTEL
Service Placement
Service Placement Problem in Fog and Edge Computing
NPTEL
Ref: Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement Problem in Fog and Edge
Computing. ACM Comput. Surv. 53, 3, Article 65 (June 2020), 35 pages.
Service Placement Problem in Fog and Edge Computing
• Fog Computing is a highly virtualized platform that offers computational
resources, storage, and control between end‐users and Cloud servers.
• It is a new paradigm in which centralized Cloud coexists with distributed
edge nodes and where the local and global analyses are performed at
the edge devices or forwarded to the Cloud.
• Fog infrastructure consists of IoT devices (End layer), Fog Nodes, and at
least one Cloud Data Center (Cloud layer), with the following characteristics:
• Location awareness and low latency
• Better bandwidth utilization
• Scalable
• Support for mobility
Service Placement Problem in Fog and Edge Computing
NPTEL
Ref: Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement Problem in Fog and Edge
Computing. ACM Comput. Surv. 53, 3, Article 65 (June 2020), 35 pages.
Deployment (Application Placement) on Cloud‐Fog‐Edge framework
NPTEL
Ref: Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement Problem in Fog and Edge
Computing. ACM Comput. Surv. 53, 3, Article 65 (June 2020), 35 pages.
Application Placement on Cloud‐Fog‐Edge framework
NPTEL
Ref: Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement Problem in Fog and Edge
Computing. ACM Comput. Surv. 53, 3, Article 65 (June 2020), 35 pages.
Service Placement – Optimization Strategies
NPTEL
Ref: Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement Problem in Fog and Edge Computing. ACM Comput. Surv. 53, 3, Article 65
(June 2020), 35 pages.
Control
• Centralized
• Distributed
NPTEL
Ref: Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement Problem in Fog and Edge Computing. ACM Comput. Surv. 53, 3,
Article 65 (June 2020), 35 pages.
Hardware
• Fog/edge computing forms a computing environment that uses low-power
devices, namely mobile devices, routers, gateways, and home systems.
• The combination of these small-form-factor devices, connected to the network,
enables a cloud computing environment that can be leveraged by a rich
set of applications processing Internet of Things (IoT) and cyber-physical
systems (CPS) data.
Hardware (contd..)
NPTEL
Ref: Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement Problem in Fog and Edge Computing. ACM Comput. Surv. 53, 3,
Article 65 (June 2020), 35 pages.
System Software
• System software for the fog/edge is a platform designed to operate
directly on fog/edge devices
• Manage the computation, network, and storage resources of the
devices.
• System software needs to support multi‐tenancy and isolation, because
fog/edge computing accommodates several applications from different
tenants.
• Two categories
• System Virtualization
• Network Virtualization
System Software (contd..)
NPTEL
Ref: Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement Problem in Fog and Edge Computing. ACM Comput. Surv. 53, 3,
Article 65 (June 2020), 35 pages.
Middleware
• Middleware provides complementary services to system software.
• Middleware in fog/edge computing provides performance monitoring,
coordination and orchestration, communication facilities, protocols etc.
NPTEL
Ref: Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement Problem in Fog and Edge Computing. ACM Comput. Surv. 53, 3,
Article 65 (June 2020), 35 pages.
ALGORITHMS
• Algorithms used to facilitate fog/edge computing. Four major
algorithms.
• Discovery: identifying edge resources within the network that can be
used for distributed computation
• Benchmarking: capturing the performance of resources for decision‐
making to maximize the performance of deployments
• Load‐balancing: distributing workloads across resources based on
different criteria such as priorities, fairness etc.
• Placement: identifying resources appropriate for deploying a
workload.
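As a toy illustration of the placement step (not an algorithm from the cited surveys), the sketch below greedily places each service on the lowest-latency node that still has enough free capacity. All node and service figures are made up.
Example (Python, illustrative):
nodes = [                                   # (name, free CPU units, latency to data source in ms)
    {"name": "edge-gw-1", "cpu": 4, "latency": 5},
    {"name": "fog-node-1", "cpu": 8, "latency": 15},
    {"name": "cloud-dc", "cpu": 64, "latency": 80},
]
services = [{"name": "filter", "cpu": 2},
            {"name": "analytics", "cpu": 6},
            {"name": "archive", "cpu": 10}]

placement = {}
for svc in services:
    candidates = [n for n in nodes if n["cpu"] >= svc["cpu"]]     # feasible nodes only
    best = min(candidates, key=lambda n: n["latency"])            # prefer the closest one
    best["cpu"] -= svc["cpu"]                                     # reserve capacity
    placement[svc["name"]] = best["name"]

print(placement)   # {'filter': 'edge-gw-1', 'analytics': 'fog-node-1', 'archive': 'cloud-dc'}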
ALGORITHMS (contd..)
NPTEL
Ref: Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement Problem in Fog and Edge Computing. ACM Comput. Surv. 53, 3, Article 65 (June 2020), 35 pages.
Cheol‐Ho Hong, Blesson Varghese, Resource Management in Fog/Edge Computing: A Survey
on Architectures, Infrastructure, and Algorithms, ACM Computing Surveys, Vol 52(5), October
2019, pp 1–37.
Farah Aït Salaht, Frédéric Desprez, and Adrien Lebre. 2020. An Overview of Service Placement
Problem in Fog and Edge Computing. ACM Comput. Surv. 53, 3, Article 65 (June 2020), 35
pages.
Agarwal, S.; Yadav, S.; Yadav, A.K. An efficient architecture and algorithm for resource
provisioning in fog computing. Int. J. Inf. Eng. Electronic Bus. (IJIEEB) 2016, 8, 48–61.
NPTEL
NPTEL
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 09: Cloud Computing Paradigm
Lecture 44: Cloud Federation
Cloud‐Fog Paradigm – Resource
Management Issues
NPTEL
Cloud Computing
Resource Management
NPTEL
Service Placement
Cloud Federation
NPTEL
Cloud Federation?
• A federated cloud (also called cloud federation) is the
deployment and management of multiple external and
internal cloud computing services to match business needs.
• A federation is the union of several smaller parts that
perform a common action.
NPTEL
[Ref: http://whatis.techtarget.com/definition/federated‐cloud‐cloud‐federation]
Cloud Federation?
Collaboration between Cloud Service Providers (CSPs) to achieve:
• Capacity utilization
• Inter‐operability
• Catalog of services
• Insight about providers and SLA’s
NPTEL
Federation ‐ Motivation
• Different CSPs join together to form a federation
• Benefits:
– Maximize resource utilization
– Minimize power consumption
– Load balancing
– Global utility
– Expand CSP’s global foot prints
Federation ‐ Characteristics
• To overcome the current limitations of cloud computing such as service
interruptions, lack of interoperability and degradation of services.
• Many inter‐cloud organizations have been proposed.
• Cloud federation is an example of an inter‐cloud organization.
• The federation contract can enable a certain level of control over remote resources
(for example, allowing the definition of affinity rules to force two or more
remote VMs to be placed in the same physical cluster);
• May agree to the interchange of more detailed monitoring information
(for example, providing information about the host where the VM is
located, energy consumption, and so on);
• May enable some advanced networking features among partner clouds
(for example, the creation of virtual networks across site boundaries).
Tightly Coupled Federation
• In this case the clouds are normally governed by the same cloud
administration.
• A cloud instance can have advanced control over remote resources—for
example, allowing decisions about the exact placement of a remote VM—
and can access all the monitoring information available about remote
resources.
• May allow other advanced features, including the creation of cross‐site
networks, cross‐site migration of VMs, implementation of high availability
techniques among remote cloud instances, and creation of virtual storage
systems across site boundaries.
Cloud Federation Architectures
NPTEL
Federation Broker
NPTEL
NPTEL
NPTEL
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 10: Cloud Computing Paradigm
Lecture 45: Cloud Migration ‐ I
VM Migration ‐ Basics
Migration strategies
NPTEL
Virtual Machine (VM)
VM Migration
NPTEL
VM Migration
NPTEL
VM Migration
• VM Migration – It is the process of moving running applications or
VMs from one physical server (host) to another.
• Processor state, storage, memory, and network connections
are moved from one host to the other.
• Why to migrate VMs?
– Distribute VM load efficiently across servers in a cloud
– System maintenance
Virtualization
[Figure: virtualization stack – applications running on guest OSes (Linux, NetBSD, Windows) inside VMs, hosted by a Virtual Machine Monitor (VMM)/hypervisor on top of the hardware]
VM Migration – Needs
• Load Balancing: For fair distribution of workload among computing
resources.
• Maintenance: For server maintenance VMs can be migrated transparently
from one server to another.
• Manage Operational Parameters: To reduce operational parameters like
power consumption, VMs can be consolidated on minimal number of
servers. Under‐utilized servers can be put on a low power mode to reduce
power consumption.
• Quality‐of‐Service violation: When the service provider fails to meet the
desired quality‐of‐services (QoS) a user can migrate his VM to another
service provider.
• Fault Tolerance: In case of failure, VMs can be migrated from one data
center to another where they can be executed
VM Migration – Types
• Cold or Non‐Live Migration: In case of cold migration the VM
executing on the source machine is turned off or suspended
during the migration process.
NPTEL
Migration ‐ Concerns
• Minimize the downtime
– Downtime refers to the total amount of time services remain
unavailable to the users.
• Minimize total migration time
– Migration time refers to the total time taken to move a VM from
the source host to the destination
host. It can be considered as
the total time taken for the entire migration process.
• Migration does not unnecessarily disrupt active services
through resource contention (e.g., CPU, network bandwidth)
with the migrating OS.
What to Migrate?
• CPU context of VM, contents of Main Memory
• Disk
– If NAS (network attached storage) that is accessible from both hosts, or local disk is
mirrored – migrating disk data may not be critical
• Network: assume both hosts on same LAN
– Migrate IP address, advertise new MAC address to IP mapping via ARP reply
– NPTEL
Migrate MAC address, let switches learn new MAC location
– Network packets redirected to new location (with transient losses)
• I/O devices
– Virtual I/O devices easier to migrate, direct device assignment of physical devices to
VMs may be difficult to migrate
Memory Migration ‐ Steps
• Push - Source VM continues running while certain pages are pushed across
the network to the new destination. To ensure consistency, pages modified
during this process must be re-sent.
• Stop‐and‐copy - Source VM is stopped, pages are copied across to the
destination VM, then the new VM is started.
• Pull - The new VM executes and, if it accesses a page that has not yet been
copied, this page is faulted in (“pulled”) across the network from the source
VM.
Pure Stop‐and‐Copy
– Simple but both downtime and total migration time are proportional to the
amount of physical memory allocated to the VM.
– May lead to an unacceptable outage if the VM is running a live service.
Live Migration ‐ Phases
• Pre‐Copy Phase: It is carried out over several rounds. The VM continues
to execute at the source, while its memory is copied to the destination.
• Pre‐copy Termination Phase: Stopping criteria of Pre‐Copy phase takes
one of the following thresholds into account: (i) The number of rounds
exceeds a threshold. (ii) The total memory transmitted exceeds a
threshold. (iii) The number of dirtied pages in the previous round drops
below a threshold.
• Stop‐and‐Copy Phase: In this phase, execution of the VM to be migrated
is suspended at the source. Then, the remaining dirty pages and, state of
the CPU is copied to the destination host, where the execution of VM is
resumed.
Iterative Pre Copy Live Memory Migration
• Pre‐ copy Phase:
– This phase may be carried out over several rounds.
– The VM continues to execute at the source host, while its memory is copied to the
destination host.
– Active pages of the VM to be migrated are copied iteratively in each round.
– During the copying process some active page might get dirtied at the source host, which
are again resent in the subsequent rounds to ensure memory consistency.
• Pre copy‐termination phase: Stopping criteria ‐ options
– Number of rounds exceeds a threshold.
– Total memory transmitted exceeds a threshold.
– Number of dirtied pages in the previous round drops below a threshold.
• Stop‐and‐Copy Phase:
– In this phase, execution of the VM to be migrated is suspended at the source.
– Then, the remaining dirty pages and, state of the CPU is copied to the destination host,
where the execution of VM is resumed.
• Restarting Phase: Restart the VM on destination server.
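A small simulation of the pre-copy phases described above is sketched below with made-up numbers: each round transfers the pages dirtied during the previous round, the loop stops when any of the three termination criteria is met, and the remaining dirty pages determine the stop-and-copy downtime.
Example (Python, illustrative):
VM_MEMORY_MB       = 4096     # Vm
RATE_MB_S          = 1000     # R, transmission rate
DIRTY_MB_S         = 200      # P*D, memory dirtying rate
MAX_ROUNDS         = 30       # termination criterion (i)
MAX_TRANSFER_MB    = 4 * VM_MEMORY_MB   # termination criterion (ii)
DIRTY_THRESHOLD_MB = 50       # termination criterion (iii)

to_send = VM_MEMORY_MB        # round 0 copies the whole memory
sent, rounds = 0.0, 0
while True:
    round_time = to_send / RATE_MB_S
    sent += to_send
    rounds += 1
    to_send = DIRTY_MB_S * round_time      # pages dirtied while this round was being copied
    if rounds >= MAX_ROUNDS or sent >= MAX_TRANSFER_MB or to_send <= DIRTY_THRESHOLD_MB:
        break                              # pre-copy termination criteria

downtime = to_send / RATE_MB_S             # stop-and-copy: remaining dirty pages (+ CPU state)
print(f"pre-copy rounds: {rounds}, data sent: {sent:.0f} MB, downtime: {downtime*1000:.1f} ms")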
Post‐copy Live Memory Migration
• Stop Phase: Stop the source VM and copy the CPU state to the
destination VM.
• Restart Phase: Restart the destination VM.
• On‐demand Copy: Copy the VM memory according to the demand.
Migration strategies
NPTEL
Virtual Machine (VM)
VM Migration
NPTEL
VM Migration (contd.)
NPTEL
VM Migration
• VM Migration – It is the process of moving running applications or
VMs from one physical server (host) to another.
• Processor state, storage, memory, and network connections
are moved from one host to the other.
• Why to migrate VMs?
– Distribute VM load efficiently across servers in a cloud
– System maintenance
VM Live Migration – Requirements
• Load Balancing: When the load is considerably unbalanced or downtime is
impending, simultaneous migration of one or more VMs is often required.
• Fault tolerance: Faults are another challenge to guaranteeing critical service
availability and reliability. Failures should be anticipated and proactively
handled.
• Power management: Switching idle servers to either sleep mode or off mode
based on resource demands leads to energy saving. VM live migration is a
good technique for cloud power efficiency.
• Resource sharing: The challenge of limited hardware resources like memory,
cache, and CPU cycles can be addressed by relocating VMs from an
over-loaded server to an under-loaded server.
• System maintenance: Physical systems need to be upgraded and serviced,
so the VMs of that physical server must be migrated to an alternate server so
that services remain available to users without interruption.
Live Migration – Pre‐copy Approach
• Uses an iterative push phase that is followed by a stop-and-copy phase.
• Because of the iterative procedure, some memory pages get updated/modified
(called dirty pages) on the source server during the migration iterations.
• Dirty pages are resent to the destination host in a future iteration, hence
some of the frequently accessed memory pages are sent several times. This
causes a long migration time.
• In the first phase, all pages are transferred while the VM runs
continuously on the source host. In further round(s), dirty pages are
re-sent.
Live Migration – Pre‐copy Approach (contd.)
• The second phase is the termination phase, which depends on the defined
thresholds. Termination is executed if any one of three conditions holds:
(i) the number of iterations exceeds the pre-defined limit, (ii) the total
amount of memory sent exceeds a threshold, or (iii) the number of dirty pages
in the just-previous round falls below the defined threshold.
• In the last, stop-and-copy phase, the migrating VM is suspended at the source
server; after that, the processor state and the remaining dirty pages are moved.
• When the VM migration process completes correctly, the hypervisor
resumes the migrated VM on the destination server.
• KVM, Xen, and VMware hypervisors use the pre-copy technique for live
VM migration.
Live Migration – Pre‐copy Approach
NPTEL
Live Migration – Post‐copy Approach
• In the post-copy migration technique, the processor state is transferred before
the memory content, and then the VM can be started at the destination server.
• The post-copy VM migration technique investigates demand paging, active
push, pre-paging, etc. as approaches for prefetching memory pages at the
destination server.
Stop Phase: Stop the source VM and copy the CPU state to the
destination VM.
Restart Phase: Restart the destination VM.
On‐demand Copy: Copy the VM memory according to the demand.
Live Migration – Post‐copy Approach
NPTEL
VM Migration – Analysis
• Let Tmig be the total migration time.
• Let Tdown be the total down time.
• For non-live migration of a single VM, the migration time Tmig can be
calculated as follows:
Tmig = Vm / R
where Vm is the size (i.e., the memory) of the VM and R is the transmission rate.
• In non-live migration, the downtime is the same as the migration time because
the services of the VM are suspended during the entire migration process:
Tdown = Tmig
Note: the transmission rate is assumed to remain fixed for the entire duration of the migration.
VM Migration – Analysis (contd.)
• Let n represent the total number of iterations in the pre-copy cycle.
• Let Ti,j represent the time taken by the jth iteration to transmit the ith virtual
machine’s memory.
• Vm : the memory size of a VM.
• Vth : threshold for stopping the iterations.
• nmax : maximum number of iterations.
• r = (P·D)/R, where P is the page size, D is the dirtying rate, and R is the transmission rate.
• Tres denotes the time taken to restart the VM on the destination server.
VM Migration – Analysis (contd.)
• Pre-copy migration mechanism: the VM’s memory is migrated iteratively.
• We can compute the total migration time Ti,mig of the ith VM as follows:
• Ti,mig = Σ (j = 0 to n) Ti,j = (Vm/R) · (1 − r^(n+1)) / (1 − r) + Tres
• Ti,down = r^n · (Vm/R) + Tres
VM Migration – Analysis (contd.)
• Round 0: t0 = Vm/R
• Round 1: t1 = ((P·D)/R) · t0 = ((P·D)/R) · (Vm/R) = r · (Vm/R)
• Round 2: t2 = ((P·D)/R) · t1 = ((P·D)/R) · (r · Vm/R) = r² · (Vm/R)
• Round 3: t3 = ((P·D)/R) · t2 = r³ · (Vm/R)
……
• Round n−1: tn−1 = ((P·D)/R) · tn−2 = r^(n−1) · (Vm/R)
• Round n (Stop and Copy): tn = ((P·D)/R) · tn−1 = r^n · (Vm/R)
• T = t0 + t1 + … + tn−1 + tn = (Vm/R) · (1 + r + r² + … + r^(n−1) + r^n) = (Vm/R) · (1 − r^(n+1)) / (1 − r)
VM Migration – Analysis (contd.)
Estimation of Number of Rounds (n)
• n = min(⌈ log_r (Vth / Vm) ⌉, nmax)
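Plugging illustrative numbers into the formulas above (the formulas are as reconstructed on these slides; all values below are made up):
Example (Python, illustrative):
from math import ceil, log

Vm, R  = 4096, 1000     # VM memory (MB) and transmission rate (MB/s)
P_D    = 200            # P*D: dirtying rate in MB/s
Vth    = 50             # stop when expected dirty data per round falls below this (MB)
Tres   = 0.1            # VM restart time (s)
n_max  = 30

r = P_D / R                                          # r = (P*D)/R = 0.2
n = min(ceil(log(Vth / Vm, r)), n_max)               # number of pre-copy rounds
T_mig  = (Vm / R) * (1 - r**(n + 1)) / (1 - r) + Tres
T_down = (r**n) * (Vm / R) + Tres
print(f"rounds n = {n}, T_mig = {T_mig:.2f} s, T_down = {T_down*1000:.0f} ms")
# With these numbers: n = 3, T_mig ≈ 5.2 s, T_down ≈ 133 ms.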
Multiple VMs Migration
• Generally multiple VMs are migrated from a source
host to the destination host.
• Typical strategies for migration multiple VMs:
– Serial Migration.
NPTEL
– Parallel migration.
Serial Migration
In case of serial migration of ‘m’ correlated VMs of same type the procedure
is as follows:
• The first VM that is selected to be migrated executes its pre‐ copy cycle and the
other (m‐1) VMs continue to provide services.
• As soon as the first VM enters into the stop and copy phase the remaining (m‐1)
VMs are suspended and are copied after the first VM completes its stop and copy
phase.
• Reason for stopping the remaining (m‐1) VMs is to stop those VMs from dirtying
memory.
• Assumption: each VM is copied at the full transmission rate (R).
• Downtime for the serial migration includes the stop and copy phase of the first VM,
the migration time for the (m‐1) VMs and the time to resume the VMs at the
destination host.
Serial Migration
• Consider there are ‘m’ VMs that are to be migrated serially.
• Migration time and downtime for serial migration strategy can be
calculated as follows:
• Tsmig = Σ Ti,mig = (m·Vm/R) · (1 − r^(n+1)) / (1 − r) + Tres
• Tsdown = (Vm/R) · r^n + (m − 1) · (Vm/R) + Tres
Parallel Migration
• Major difference between parallel and serial migrations is that all ‘m’ VMs
start their pre‐copy cycles simultaneously.
• In fact each VM shares (R/m) of the transmission capacity.
• As the VM sizes are same and transmission rates are same the VMs begin
the stop and copy phase as the same time and they end the stop and
copy phase also at the same time.
• Since the stop and copy phase is executed in parallel and they consume
the same amount of time the downtime is in fact equivalent to the time
taken by the stop and copy phase for any VM added to the time taken to
resume the VMs at the destination host.
Parallel Migration
• Tpmig = Σ Ti,mig = (m·Vm/R) · (1 − (m·r)^(n(p)+1)) / (1 − m·r) + Tres
• Tpdown = (m·Vm/R) · (m·r)^(n(p)) + Tres
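Using the same illustrative numbers as before, the serial and parallel expressions above (as reconstructed on these slides) can be compared directly; the parallel case needs m·r < 1 for the geometric series to converge. All values are made up.
Example (Python, illustrative):
from math import ceil, log

m, Vm, R, P_D, Vth, Tres, n_max = 4, 4096, 1000, 200, 50, 0.1, 30
r = P_D / R

def rounds(ratio):                       # pre-copy rounds for a given dirtying ratio (< 1)
    return min(ceil(log(Vth / Vm, ratio)), n_max)

# Serial: pre-copy at the full rate R
n = rounds(r)
T_s_mig  = (m * Vm / R) * (1 - r**(n + 1)) / (1 - r) + Tres
T_s_down = (Vm / R) * r**n + (m - 1) * (Vm / R) + Tres

# Parallel: all m VMs share R, so each sees rate R/m and an effective ratio m*r
n_p = rounds(m * r)
T_p_mig  = (m * Vm / R) * (1 - (m * r)**(n_p + 1)) / (1 - m * r) + Tres
T_p_down = (m * Vm / R) * (m * r)**n_p + Tres

print(f"serial:   T_mig = {T_s_mig:.1f} s, T_down = {T_s_down:.2f} s")
print(f"parallel: T_mig = {T_p_mig:.1f} s, T_down = {T_p_down:.2f} s")
# Here serial finishes sooner overall, while parallel gives a much smaller downtime.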
Kai Hwang, Geoffrey C. Fox, Jack J. Dongarra, Distributed and Cloud Computing ‐
From Parallel Processing to the Internet of Things, Morgan Kaufmann, Elsevier,
2012
Christian Limpach, Ian Pratt, Christopher Clark, Keir Fraser, Steven Hand, Jacob
Gorm Hansen, Eric Jul, Andrew Warfield, Live Migration of Virtual Machines, NSDI,
2005
Michael R. Hines and Kartik Gopalan,
Post‐Copy Based Live Virtual Machine
Migration Using Adaptive Pre‐Paging and Dynamic Self‐Ballooning, 2009
Anita Choudhary, Mahesh Chandra Govil, Girdhari Singh, Lalit K. Awasthi,
Emmanuel S. Pilli, Divya Kapil, A critical survey of live virtual machine migration
techniques, Journal of Cloud Computing, Springer, 6(23), 2017
NPTEL
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 10: Cloud Computing Paradigm
Lecture 47: Container based Virtualization ‐ I
Containers
Container based Virtualization
Kubernetes
NPTEL
Docker Container
Container
Virtualization
NPTEL
Containers
NPTEL
Containers ‐ Introduction
● Virtualization helps to share resources among many
customers in cloud computing.
● Container is a lightweight virtualization technique.
● Container packages the code and all its dependencies so the
application runs quickly and reliably from one computing
environment to another.
● Docker is an open platform for developing, shipping, and running applications.
● Kubernetes is an open‐source system for automating
deployment, scaling, and management of containerized
applications.
Containers ‐ Introduction
● Containers are packages of software that contain all of the
necessary elements to run in any environment.
● Containers virtualize the operating system and run anywhere, from
a private data center to the public cloud or even on a developer’s
personal laptop.
● Containers are lightweight packages of the application code
together with dependencies such as specific versions of
programming language runtimes and
libraries required to run the
software services.
● Containers make it easy to share CPU, memory, storage, and
network resources at the operating systems level and offer a
logical packaging mechanism in which applications can be
abstracted from the environment in which they actually run.
Ref: https://cloud.google.com/learn/what‐are‐containers
Containers ‐ Needs
• Containers offer a logical packaging mechanism in which applications
can be abstracted from the environment in which they actually run.
• This decoupling allows container‐based applications to be deployed
easily and consistently, regardless of whether the target environment is
a private data center, the public cloud, or even a developer’s personal
laptop.
• Agile development: Containers allow the developers to move much
more quickly by avoiding concerns about
dependencies and
environments.
• Efficient operations: Containers are lightweight and allow to use just the
computing resources one need – thus running the applications
efficiently.
• Run anywhere: Containers are able to run virtually anywhere.
Ref: https://cloud.google.com/learn/what‐are‐containers
Containers – Major Benefits
• Separation of responsibility: Containerization provides a clear separation
of responsibility, as developers focus on application logic and
dependencies, while IT operations teams can focus on deployment and
management instead of application details such as specific software
versions and configurations.
• Workload portability: Containers can run virtually anywhere, greatly
easing development and deployment: on Linux, Windows, and Mac
operating systems; on virtual machines
or on physical servers; on a
developer’s machine or in data centers on‐premises; and of course, in
the public cloud.
• Application isolation: Containers virtualize CPU, memory, storage, and
network resources at the operating system level, providing developers
with a view of the OS logically isolated from other applications.
Ref: https://cloud.google.com/learn/what‐are‐containers
Application Deployment
NPTEL
Ref: https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/
Traditional – Virtualized – Container Deployments
• Traditional deployment : Applications run on physical servers. There was no way to
define resource boundaries for applications in a physical server, and this caused resource
allocation issues.
• Virtualized deployment : Allows to run multiple Virtual Machines (VMs) on a single
physical server's CPU. Virtualization allows applications to be isolated between VMs. It
allows better utilization of resources in a physical server and allows better scalability. Each
VM is a full machine running all the components, including its own operating system, on
top of the virtualized hardware.
• NPTEL
Container deployment: Containers are similar to VMs, but they have relaxed isolation
properties to share the Operating System (OS) among the applications. Therefore,
containers are considered lightweight. A container has its own filesystem, share of CPU,
memory, process space, and more. As containers are decoupled from the underlying
infrastructure, they are portable across clouds and different OS distributions.
Containers and VMs
• VMs: a guest operating system such as Linux or Windows runs on top of a host
operating system with access to the underlying hardware.
• Containers are often compared to virtual machines (VMs). Like virtual
machines, containers allow one to package the application together with
libraries and other dependencies, providing isolated environments for running
your software services.
• However, containers offer a far more lightweight unit for developers and IT
Ops teams to work with, carrying a myriad of benefits.
Ref: https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/
Kubernetes Components
• A Kubernetes cluster consists of a set of worker machines, called nodes, that run
containerized applications. Every cluster has at least one worker node.
• The worker node(s) host the Pods that are the components of the application
workload.
• The control plane manages the worker nodes and the Pods in the cluster.
• In production environments, the control plane usually runs across multiple
computers and a cluster usually runs multiple nodes, providing fault‐tolerance
and high availability.
Ref: https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/
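To make the cluster/node/Pod relationship above concrete, the following sketch uses the official Kubernetes Python client to ask the control plane for the worker nodes and the Pods they host. The client library and a valid kubeconfig are assumptions here; the slides do not prescribe any particular tooling.

# Minimal sketch (assumes the 'kubernetes' Python client and a reachable cluster)
from kubernetes import client, config

config.load_kube_config()                 # read cluster credentials from ~/.kube/config
v1 = client.CoreV1Api()                   # core API group: nodes, pods, services, ...

for node in v1.list_node().items:         # machines managed by the control plane
    print("Node:", node.metadata.name)

for pod in v1.list_pod_for_all_namespaces().items:   # Pods that make up the workload
    print("Pod:", pod.metadata.namespace, pod.metadata.name)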
Kubernetes Cluster Components
Ref: https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/
Docker Engine
• Docker containers that run on
Docker Engine:
• Standard: Docker created the
industry standard for containers,
so they could be portable
anywhere
• Lightweight: Containers share the machine's OS kernel and therefore do not require an OS per application, driving higher server efficiencies and reducing server and licensing costs
• Secure: Applications are safer in
containers and Docker provides
the strongest default isolation
capabilities in the industry
Ref: https://www.docker.com/resources/what-container
Docker
• A Docker container image is a lightweight, standalone, executable
package of software that includes everything needed to run an
application: code, runtime, system tools, system libraries and settings.
• Container images become containers at runtime and in the case of
Docker containers ‐ images become containers when they run
on Docker Engine.
• Available for both Linux and Windows-based applications, containerized
software will always run the same, regardless of the infrastructure.
Containers isolate software from its environment and ensure that it
works uniformly despite differences for instance between development
and staging.
Ref: https://www.docker.com/resources/what-container
https://kubernetes.io/docs/concepts/overview/what‐is‐kubernetes/
https://www.docker.com/resources/what‐container
https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 10: Cloud Computing Paradigm
Lecture 48: Container – II (Docker)
Docker Container – Overview
Docker – Components
Docker – Architecture
Container
Docker Container
Docker
https://www.docker.com/
Docker ‐ Overview
● Docker is a platform that allows you to “build, ship, and run any
app, anywhere.”
● Considered to be a standard way of solving one of the challenging
aspects of software: deployment.
● Traditionally, the development pipeline typically involved
combinations of various technologies for managing the movement
of software, such as virtual machines, configuration management
tools, package management systems,
and complex webs of library
dependencies.
● All these tools needed to be managed and maintained by specialist
engineers, and most had their own unique ways of being configured.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker ‐ Overview
● Docker has changed the traditional approach ‐ Everything
goes through a common pipeline to a single output that can
be used on any target—there’s no need to continue
maintaining a bewildering array of tool configurations
● At the same time, there’s no need to throw away the
existing software stack if it works
for you—you can package
it up in a Docker container as‐is, for others to consume.
● Additionally, you can see how these containers were built, so
if you need to dig into the details, you can.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker – Big Picture
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker ‐ Analogy
● Analogy: Traditionally, a docker was a laborer who moved
commercial goods into and out of ships when they docked at
ports. There were boxes and items of differing sizes and shapes,
and experienced dockers were prized for their ability to fit goods
into ships by hand in cost‐effective ways. Hiring people to move
stuff around wasn’t cheap, but there was no alternative!
● This may sound familiar to anyone working
in software. Much time
and intellectual energy is spent getting metaphorically odd‐shaped
software into differently‐sized metaphorical ships full of other
odd‐shaped software, so they can be sold to users or businesses
elsewhere.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker ‐ Benefit
● Before Docker, deploying software to different environments
required significant effort. Even if you weren’t hand‐running
scripts to provision software on different machines (and plenty of
people do exactly that), you’d still have to handle the configuration
management tools that manage state on what are increasingly
fast‐moving environments starved of resources.
● Even when these efforts were encapsulated
in VMs, a lot of time
was spent managing the deployment of these VMs, waiting for
them to boot, and managing the overhead of resource use they
created.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker ‐ Benefit
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker ‐ Benefit
● With Docker, the configuration effort is separated from the resource
management, and the deployment effort is trivial:
run docker, and the environment’s image is pulled down and
ready to run, consuming fewer resources and contained so that it
doesn’t interfere with other environments.
● You don’t need to worry about whether your container is going to be
shipped to a Red Hat machine, an Ubuntu machine, or a CentOS VM
image; as long as it has Docker on it, it will run.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker ‐ Advantage
● Replacing virtual machines (VMs): Docker can be used to replace
VMs in many situations. If you only care about the application, not
the operating system, Docker can replace the VM.
Not only is Docker quicker than a VM to spin up, it’s more lightweight to
move around, and due to its layered filesystem, you can more easily and
quickly share changes with others. It’s also rooted in the command line
and is scriptable.
● Prototyping software: If you want to quickly experiment with
software without either disrupting your existing setup or going
through the hassle of provisioning a VM, Docker can give you a
sandbox environment almost instantly.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker ‐ Advantage
• Packaging software: Because a Docker image has effectively no
dependencies, it’s a great way to package software. You can build
your image and be sure that it can run on any modern Linux
machine—think Java, without the need for a JVM.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker ‐ Advantage
• Reducing debugging overhead: Complex negotiations between different teams about delivered software are commonplace within the industry.
– Docker allows you to state clearly (even in script form) the steps for
debugging a problem on a system with known properties, making bug and
environment reproduction a much simpler affair, and one normally
separated from the host environment provided.
• Documenting software dependencies: By building your images
in a structured way, ready to be moved to different
environments, Docker forces you to document your software
dependencies explicitly from a base starting point.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker ‐ Advantage
• Enabling continuous delivery: Continuous delivery (CD) is a
paradigm for software delivery based on a pipeline that rebuilds
the system on every change and then delivers to production (or
“live”) through an automated (or partly automated) process.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker – Key Concepts
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker – Key Commands
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker – Architecture
• Docker on your host machine is split into two parts—a daemon
with a RESTful API and a client that talks to the daemon.
• The private Docker registry is a service that stores Docker images.
These can be requested from any Docker daemon that has the
relevant access. This registry is on an internal network and isn’t
publicly accessible, so it’s considered private.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
Docker – Architecture
• One invokes the Docker client to get information from or give instructions
to the daemon; the daemon is a server that receives requests from, and returns responses to, the client using the HTTP protocol.
• In turn, it will make requests to other services to send and receive
images, also using the HTTP protocol.
• The server will accept requests from the command‐line client or anyone
else authorized to connect.
• The daemon is also responsible for taking care of your images and
containers behind the scenes, whereas the client acts as the intermediary
between you and the RESTful API.
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
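As a hedged illustration of the client/daemon split described above, the sketch below uses the Docker SDK for Python as the client; it sends HTTP requests to the local daemon, which pulls the image from a registry and runs the container. The image name and command are only examples.

# Sketch only: the Python SDK plays the role of the Docker client.
# Assumes the 'docker' Python package and a running Docker daemon.
import docker

client = docker.from_env()                         # connect to the daemon's RESTful API
print("daemon reachable:", client.ping())

client.images.pull("alpine:latest")                # the daemon fetches the image from a registry
output = client.containers.run("alpine:latest",    # the daemon creates and runs the container
                               "echo hello from a container",
                               remove=True)
print(output.decode())                             # stdout captured from the container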
Docker – Architecture
Ref: Docker in Practice, Second Edition, Ian Miell and Aidan Hobson Sayers, February 2019, ISBN 9781617294808
https://www.docker.com/
Docker in Practice, Second Edition, Ian Miell and Aidan Hobson
Sayers, February 2019, ISBN 9781617294808
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 10: Cloud Computing Paradigm
Lecture 49: Docker Container ‐ Demo (Part‐I)
Docker Container ‐ Demo
Container
Docker
Docker Demo - I
Introduction
• Containers
• Standard unit of software
• Packages up code and all its dependencies
• Application runs quickly and reliably
• Docker container image
• Lightweight
• Standalone
• Executable package of software
• Includes everything needed to run an application
Demo ‐ Objective
MySQL and PHPMyAdmin on Docker platform
MySQL
Widely used relational database package
Open‐source
PHPMyAdmin
A graphical user interface
Web‐based
Connects to MySQL database
Widely used for managing MySQL databases
Standalone System (No Container)
• Separate installation for
• MySQL
• Web Server (Apache)
• PHP
• PHPMyAdmin
• Transferring to another machine/system
• Separate installation
• Backup of data from old MySQL server
• Restore the backup to new MySQL server
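For contrast with the standalone installation above, the sketch below shows how the same stack could be brought up as two containers by driving the Docker CLI from Python. The image names, network, port, and password are illustrative assumptions and are not taken verbatim from the recorded demo.

# Sketch: MySQL + phpMyAdmin as containers instead of separate installations.
import subprocess

def docker(*args):
    cmd = ["docker", *args]
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

docker("network", "create", "demo-net")                    # shared network for both containers

docker("run", "-d", "--name", "demo-mysql",                # MySQL server container
       "--network", "demo-net",
       "-e", "MYSQL_ROOT_PASSWORD=secret",                 # illustrative credentials only
       "mysql:8")

docker("run", "-d", "--name", "demo-phpmyadmin",           # web-based GUI container
       "--network", "demo-net",
       "-e", "PMA_HOST=demo-mysql",                        # point phpMyAdmin at the MySQL container
       "-p", "8080:80",                                    # GUI served at http://localhost:8080
       "phpmyadmin")

# Moving to another machine now means re-running this script (plus restoring data
# from a backup or reattaching a volume), rather than reinstalling each package.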
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 10: Container
Lecture 50: Docker Container ‐ Demo (Part‐II)
Docker Container ‐ Demo
Container
Docker
Docker Demo - II
Introduction
• Containers
• Standard unit of software
• Packages up code and all its dependencies
• Application runs quickly and reliably
• Docker container image
• Lightweight
• Standalone
• Executable package of software
• Includes everything needed to run an application
Demo ‐ Objective
MySQL and PHPMyAdmin on Docker platform
MySQL
Widely used relational database package
Open‐source
PHPMyAdmin
A graphical user interface
Web‐based
Connects to MySQL database
Widely used for managing MySQL databases
Standalone System (No Container)
• Separate installation for
• MySQL
• Web Server (Apache)
• PHP
• PHPMyAdmin
• Transferring to another machine/system
• Separate installation
• Backup of data from old MySQL server
• Restore the backup to new MySQL server
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 11: Cloud Computing Paradigms
Lecture 51: Dew Computing
Dew Computing – Overview
Dew Computing – Features
Dew Computing – Architecture
Dew Computing – Applications
Dew Computing
Cloud Computing “Family”
Ref: Sunday Oyinlola Ogundoyin, Ismaila Adeniyi Kamil, Optimization techniques and applications in fog computing: An
exhaustive survey, Swarm and Evolutionary Computation, Elsevier
Dew Computing (DC)
● Dew computing is a computing paradigm that combines the
core concept of cloud computing with the capabilities of
end devices (personal computers, mobile phones, etc.).
● It is used to enhance the experience for the end user in
comparison to only using cloud computing.
● Dew computing attempts to solve one of the major problems of cloud computing: the reliance on Internet access.
Dew Computing (DC)
● Dew Computing is a computing model for enabling
ubiquitous, pervasive, and convenient ready‐to‐go, plug‐in
facility empowered personal network that includes Single‐
Super‐Hybrid‐Peer P2P communication link.
● Primary goal: To access a pool of raw data equipped with
meta‐data that can be rapidly created,
edited, stored, and
deleted with minimal internetwork management effort (i.e.
offline mode).
● To utilise all functionalities of cloud computing, network users are heavily dependent on Internet connectivity all the time.
Dew Computing (DC)
• Dew computing (DC) is a new paradigm where user‐centric, flexible, and
personalized‐supported applications are prioritized. It is located very close
to the end devices and it is the first in the IoT‐fog‐cloud continuum.
• DC is a micro‐service‐based computing paradigm with vertically distributed
hierarchy.
• DC comprises smart devices, such as smart‐phones, smart‐watches, tablets,
etc., located at the edge of the network to connect with the end devices,
collect and process the IoT sensed data, and
offer other services.
• The services in DC are relatively available and it is not mandatory to have a
permanent Internet connection.
• DC is micro‐service based which means that it does not depend on any
centralized server or cloud data center.
• DC relies neither on centralized computing devices nor on a permanent Internet connection.
DC – Typical Example
Dropbox is an example of the dew computing paradigm,
as it provides access to the files and folders in the cloud in
addition to keeping copies on local devices.
Allows the user to access files during times without an internet connection; when a connection is established again, files and folders are synchronized back to the cloud server.
Ref: https://www.dropbox.com
Dew Computing ‐ Features
• Key features of dew computing are independence and
collaboration.
• Independence means that the local device must be able to provide
service without a continuous connection to the Internet.
• Collaboration means that the application must be able to connect
to the cloud service and synchronize data when appropriate.
• The word "dew" reflects natural phenomena: clouds are far from
the ground, fog is closer to the ground, and dew is on the ground.
Analogically, cloud computing is a remote service, fog computing is
beside the user, and dew computing is at the user end.
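A toy sketch of the two features, under assumptions of our own (the class and function names below are hypothetical and do not come from any dew computing framework): reads and writes always succeed against the local copy (independence), and locally changed keys are replayed to the cloud whenever a connection is available (collaboration).

# Toy illustration of independence + collaboration in dew computing.
import json, os

class LocalStore:
    def __init__(self, path="dew_store.json"):
        self.path = path
        self.data = json.load(open(path)) if os.path.exists(path) else {}
        self.dirty = []                                  # keys changed while offline

    def put(self, key, value):                           # independence: works with no Internet
        self.data[key] = value
        self.dirty.append(key)
        with open(self.path, "w") as f:
            json.dump(self.data, f)

    def sync(self, is_online, push_to_cloud):            # collaboration: sync when connected
        if is_online():
            for key in self.dirty:
                push_to_cloud(key, self.data[key])
            self.dirty.clear()

store = LocalStore()
store.put("notes/today", "edited without Internet access")
store.sync(is_online=lambda: True,
           push_to_cloud=lambda k, v: print("synced", k, "->", v))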
Dew Service Models and Typical Applications
• Storage-in-Dew: The storage of the local device is partially or fully copied into the cloud.
• Web-in-Dew: The local device must possess a duplicated fraction of the World Wide Web (WWW).
Dew Computing Architecture
[Figure: cloud-dew architecture – a Cloud Server connects over a single hybrid P2P communication link to a Dew Virtual Machine (DVM) on the host machine, which runs the Dew Server and Dew Client Program and serves local IoT devices & sensors]
To establish a cloud-dew architecture on a local machine, a dew virtual machine (DVM) is needed. The DVM is an isolated environment for executing the dew server on the local system.
DC ‐ Architecture
Attempt to achieve three goals:
• Data Replication
• Data Distribution
• Synchronization
DC – Application Areas
• Web in Dew (WiD) ‐ Possess a duplicated fraction of the World Wide Web
(WWW) or a modified copy of that fraction to satisfy the independence
feature. Because this fraction synchronizes with the web, it satisfies the
collaboration feature of dew computing.
• Storage in Dew (SiD): The storage of the local device is partially or fully copied into the cloud. Since the user can access files at any time without the need for constant Internet access, this category meets the independence feature of dew computing. SiD also meets the collaboration feature because the folder and its contents automatically synchronize with the cloud service.
• Database in Dew (DBiD): The local device and the cloud both store copies of
the same database. One of these two databases is considered the main
version and can be defined as such by the database administrator. This
service increases the reliability of a database, as one of the databases can
act as the backup for the other.
DC – Application Areas
• Software in Dew (SiD): The configuration and ownership of software are
saved in the cloud. Examples include the Apple App Store and Google Play,
where the applications the user installs are saved to their account and can
then be installed on any device linked to their account.
• Platform in Dew (PiD): A software development suite must be installed on
the local device with the settings and application data synchronized to the
cloud service. It must be able to synchronize development data, system
deployment data, and online backups. Example: GitHub.
• Infrastructure as Dew (IaD): The local device is dynamically supported by
cloud services. IaD can come in different forms, but the following two forms
can be used: (1) the local device can have an exact duplicate DVM instance
in the cloud, which is always kept in the same state as the local instance, or
(2) the local device can have all its settings/data saved in the cloud,
including system settings/data and data for each application
DC – Challenges
• Power Management
• Processor Utility
• Data Storage
• Viability of Operating System
• Programming Principles
• Database Security
Cloud Computing and Dew
[Figure: Dew-enabled Computing]
https://en.wikipedia.org/wiki/Dew_computing
Wang, Yingwei (2016). "Definition and Categorization of Dew Computing". Open
Journal of Cloud Computing. 3 (1). ISSN 2199‐1987.
"Dew Computing and Transition of Internet Computing Paradigms” ‐ ZTE
Corporation
Yingwei, Wang (2015). "The initial definition of dew computing". Dew Computing
Research.
Ray, Partha Pratim (2018). "An Introduction to Dew Computing: Definition,
Concept and Implications ‐ IEEE Journals & Magazine". IEEE Access. 6: 723–737.
doi:10.1109/ACCESS.2017.2775042.
Sunday Oyinlola Ogundoyin, Ismaila Adeniyi Kamil, Optimization techniques and
applications in fog computing: An exhaustive survey, Swarm and Evolutionary
Computation, Elsevier, Volume 66, 2021,
https://doi.org/10.1016/j.swevo.2021.100937
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 11: Cloud Computing Paradigms
Lecture 52: Serverless Computing ‐ I
Serverless Computing
Function‐as‐a‐Service
Serverless Computing - I
Serverless Computing
● Serverless computing is a method of providing backend
services on an as‐used basis. A serverless provider allows
users to write and deploy code without the hassle of
worrying about the underlying infrastructure.
● Serverless architecture simplifies the code deployment and
eliminates the need for system administration, allowing
developers to focus on the core logic without creating
additional overhead by instantiating resources, such as VMs
or containers in the monitoring infrastructure.
Serverless Computing
• In this model, developers execute their logic in the form of
functions and submit to the cloud provider to run the task in a
shared runtime environment; cloud providers manage the
scalability needs of the function, by running multiple functions in
parallel.
• Following the wide scale application of the containerization
approach, the cloud services have adapted to offer better‐fitting
containers that require less time to load (boot) and to provide
increased automation in handling (orchestration) containers on
behalf of the client.
• Serverless computing promises to achieve full automation in
managing fine‐grained containers.
Serverless Computing
• “Serverless computing is a form of cloud computing that allows users to
run event‐driven and granular applications, without having to address
the operational logic”
• Serverless as a computing abstraction: With serverless, developers focus
on high‐level abstractions (e.g., functions, queries, and events) and build
applications that infrastructure operators map to concrete resources and
supporting services.
• Developers focus on the business logic
and on ways to interconnect
elements of business logic into complex workflows.
• Service providers ensure that the serverless applications are
orchestrated—that is, containerized, deployed, provisioned, and
available on demand—while billing the user for only the resources used.
Function‐as‐a‐Service
• Clients of serverless computing can use the function‐as‐a‐service (FaaS)
model
• Function as a service (FaaS) is a form of serverless computing in which the
cloud provider manages the resources, lifecycle, and event‐driven execution of
user‐provided functions.
• With FaaS, users provide small, stateless functions to the cloud provider,
which manages all the operational aspects to run these functions.
• For example, consider the ExCamera application, which uses cloud functions
and workflows to edit, transform, and encode videos with low latency and
cost.
[Ref: S. Fouladi et al., “Encoding, fast and slow: Low‐latency video processing using thousands of tiny threads,” Proceedings
of the 14th USENIX Conference on Networked Systems Design and Implementation (NSDI 17), 2017, pp. 363–376]
Serverless Computing
• In serverless, the cloud provider dynamically
allocates and provisions servers.
• The code is executed in almost‐stateless containers
that are event‐triggered, and ephemeral (may last
for one invocation).
• Serverless covers a wide range of technologies,
that can be grouped into two categories:
– Backend‐as‐a‐Service (BaaS)
– Functions‐as‐a‐Service (FaaS)
Backend‐as‐a‐Service (BaaS)
• BaaS enables developers to replace server-side components with off-the-shelf services.
• BaaS enables developers to outsource all the aspects behind the scenes of an application, so that developers can choose to write and maintain all application logic in the frontend.
• Typical examples: remote authentication systems, database management, cloud storage, and hosting.
• Google Firebase, a fully managed database that can be
directly used from an application.
• In this case, Firebase (the BaaS service) manages data components on the user's behalf.
Function‐as‐a‐Service (FaaS)
• Serverless applications are event‐driven cloud‐based systems where
application development relies solely on a combination of third‐party
services, client‐side logic, and cloud‐hosted remote procedure calls.
• FaaS allows developers to deploy code that, upon being triggered, is
executed in an isolated environment.
• Each function typically describes a small part of an entire application. The
execution time of functions is typically limited.
• Functions are not constantly active. Instead, the FaaS platforms listen for
events that instantiate the functions.
• Thus, functions are triggered by events, such as client requests, events
produced by any external systems, data streams, or others.
• The FaaS provider is then responsible for horizontally scaling function executions in
response to the number of incoming events.
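The sketch below is a toy model of this event-driven dispatch, not any provider's actual API: small stateless functions are registered against event types and are only invoked when a matching event arrives; a real FaaS platform additionally scales these invocations horizontally and enforces execution time limits.

# Toy model of FaaS dispatch (illustrative only).
registry = {}

def faas_function(event_type):
    # register a stateless function to be triggered by an event type
    def register(fn):
        registry.setdefault(event_type, []).append(fn)
        return fn
    return register

@faas_function("object.uploaded")
def make_thumbnail(event):
    return f"thumbnail generated for {event['name']}"

@faas_function("http.request")
def hello(event):
    return f"hello {event.get('user', 'anonymous')}"

def dispatch(event_type, event):
    # the "platform": invoke every function bound to the incoming event
    return [fn(event) for fn in registry.get(event_type, [])]

print(dispatch("object.uploaded", {"name": "video.mp4"}))
print(dispatch("http.request", {"user": "alice"}))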
Serverless Computing ‐ Challenges
• Asynchronous calls:
– Asynchronous calls to and between Serverless Functions increase
complexity of the system. Usually remote API calls follow a request-response model and are easier to implement with synchronous calls.
• Functions calling other functions
– Complex debugging, loose isolation of features. Extra costs if functions
are called synchronously as we need to pay for two functions running at
the same time.
• Shared code between functions
– Might break existing Serverless Functions that depend on the shared
code that is changed. Risk to hit the image size limit (50MB in AWS
Lambda), warmup‐time (the bigger the image, the longer it takes to
start).
Serverless Computing ‐ Challenges
• Usage of too many libraries
– Increased space used by the libraries increases the risk of hitting the image size limit and increases the warmup-time.
• Adoption of too many technologies
– such as libraries, frameworks, languages.
– Adds maintenance complexity and increases skill requirements for
people working within the project.
• Too many functions
– Creation of functions without reusing existing ones. Non-active Serverless Functions don't cost anything, so there is a temptation to
create new functions instead of altering existing functionality to match
changed requirements.
– Decreased maintainability and lower system understandability.
Serverless Computing – Major Providers
Sanghyun Hong and Abhinav Srivastava and William Shambrook and Tudor
Dumitras, Go Serverless: Securing Cloud via Serverless Design Patterns, 10th
USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 18), 2018,
https://www.usenix.org/conference/hotcloud18/presentation/hong
E. van Eyk, L. Toader, S. Talluri, L. Versluis, A. Uță and A. Iosup, "Serverless is More:
From PaaS to Present Cloud Computing," in IEEE Internet Computing, vol. 22, no.
5, pp. 8‐17, Sep./Oct. 2018, doi: 10.1109/MIC.2018.053681358.
J. Nupponen and D. Taibi, "Serverless: What it Is, What to Do and What Not to
Do," 2020 IEEE International Conference on Software Architecture Companion
(ICSA‐C), 2020, pp. 49‐50, doi: 10.1109/ICSA‐C50368.2020.00016.
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 11: Cloud Computing Paradigms
Lecture 53: Serverless Computing ‐ II
Serverless Computing
AWS Lambda
Google Cloud Functions
Azure Functions
Serverless Computing - II
Serverless Computing
● Serverless computing hides the servers by providing
programming abstractions for application builders that
simplify cloud development, making cloud software easier
to write.
● The focus/target of Cloud Computing was system administrators, while Serverless targets programmers. This
change requires cloud providers to take over many of the
operational responsibilities needed to run applications.
Serverless Computing
● To emphasize the change of focus from servers to
applications, this new paradigm is known as serverless
computing, although remote servers are still the invisible
backend that powers it.
● This next phase of cloud computing will change the way
programmers work as dramatically as the Cloud Computing
has changed how operators work.
● Thus, Serverless applications are ones that don't need any server provisioning and do not require managing servers.
Serverless Computing – Major Providers
AWS Lambda
● AWS Lambda is an event‐driven, serverless computing platform
provided by Amazon as a part of Amazon Web Services.
● Thus one need not worry about which AWS resources to launch, or how to manage them. Instead, you just need to put the code on Lambda, and it runs.
● In AWS Lambda the code is executed in response to events in AWS services, such as adding/deleting files in an S3 bucket, HTTP requests from Amazon API Gateway, etc.
● However, Amazon Lambda can only be used to execute
background tasks.
AWS Lambda
● AWS Lambda function helps you to focus on your core product and
business logic instead of managing operating system (OS) access
control, OS patching, right‐sizing, provisioning, scaling, etc.
● AWS Lambda Block Diagram:
AWS Lambda
1) First upload your AWS Lambda code in any language supported by AWS
Lambda. Java, Python, Go, and C# are some of the languages that are
supported by AWS Lambda function.
2) These are some AWS services which allow you to trigger AWS Lambda.
3) AWS Lambda helps you to upload code and the event details on which it
should be triggered.
4) AWS Lambda executes the code when it is triggered by AWS services.
5) AWS charges only when the AWS lambda code executes, and not otherwise.
AWS Lambda Concepts
• Function: A function is a program or a script which runs in AWS Lambda. Lambda
passes invocation events into your function, which processes an event and
returns its response.
• Runtimes: A runtime allows functions in various languages to run on the same base execution environment. This helps in configuring your function at runtime; it also matches your selected programming language.
• Event source: An event source is an AWS service, such as Amazon SNS (Simple Notification Service), or a custom service. This triggers the function, which then executes its logic.
• Lambda Layers: Lambda layers are an important distribution mechanism for
libraries, custom runtimes, and other important function dependencies.
• Log streams: Log streams allow you to annotate your function code with custom
logging statements which helps you to analyse the execution flow and
performance of your AWS Lambda functions.
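A minimal Lambda function in Python (one of the supported runtimes listed above) looks as follows; Lambda calls the handler with the invocation event and a context object, and the returned dictionary becomes the response. The event fields shown assume a simple API Gateway style trigger.

# Minimal AWS Lambda handler in Python.
import json

def lambda_handler(event, context):
    # 'event' carries the trigger data (S3 notification, API Gateway request, ...)
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"})
    }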
AWS Lambda – How to execute code
1) AWS Lambda URL: https://aws.amazon.com/lambda/
2) Create an Account or use Existing Account
Edit the code & Click Run
3) Check output
Google Cloud Functions
• Google Cloud Functions is a serverless execution
environment for building and connecting cloud services.
• With Cloud Functions you write simple, single‐purpose
functions that are attached to events emitted from your
cloud infrastructure and services.
• Cloud Function is triggered when an event being watched is
fired.
• The code executes in a fully managed environment. There
is no need to provision any infrastructure or worry about
managing any servers.
Google Cloud Functions ‐ Working
Cloud Services: This is the Google Cloud Platform and its various services, such as Google Cloud Storage, Google Cloud Pub/Sub, Stackdriver, Cloud Datastore, etc. They all have events that happen inside of them, for example when a bucket has a new object uploaded into it or deleted from it, or its metadata updated.
Google Cloud Functions ‐ Working
Cloud Functions:
• Say an event (e.g. Object uploaded into a Bucket in Cloud Storage
happens) is generated or fired or emitted. The Event data associated
with that event has information on that event.
• If the Cloud Function is configured to be triggered by that event,
then the Cloud Function is invoked or run or executed.
• As part of its execution, the event data is passed to it, so that it can
decipher what has caused the event, i.e.
the event source, get meta
information on the event and so on and do its processing.
• As part of the processing, it might also (maybe) invoke other APIs.
(Google APIs or external APIs).
• It could even write back to the Cloud Services.
Google Cloud Functions ‐ Working
• When it has finished executing its logic, the Cloud Function
mentions or specifies that it is done.
• Multiple Event occurrences will result in multiple invocations of your
Cloud Functions. This is all handled for you by the Cloud Functions
infrastructure. You focus on your logic inside the function and be a
good citizen by keeping your function single purpose, use minimal
execution time and indicate early enough that you are done and
don’t end up in a timeout. NPTEL
• This should also indicate to you that this model works best in a
stateless fashion and hence you cannot depend on any state that
was created as part of an earlier invocation of your function. You
could maintain state outside of this framework
Google Cloud Functions ‐ Events, Triggers
• Events : They occur in Google Cloud Platform Services E.g. File Uploaded to
Storage, a Message published to a queue, Direct HTTP Invocation, etc.
• Triggers : You can choose to respond to events via a Trigger. A Trigger is the event + the data associated with the event.
• Event Data : This is the data that is passed to your Cloud Function when the
event trigger results in your function execution.
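Two minimal Cloud Functions in Python illustrate the trigger types above: an HTTP-triggered function and a Cloud Storage-triggered (background) function. The function bodies are illustrative; deployment commands are omitted.

# Sketches of HTTP- and event-triggered Cloud Functions (Python).

def hello_http(request):
    # HTTP trigger: 'request' is a Flask request object; the return value is the response
    name = request.args.get("name", "world")
    return f"Hello, {name}!"

def on_object_uploaded(event, context):
    # Cloud Storage trigger: 'event' holds the event data (bucket, object name, ...),
    # 'context' holds metadata such as the event ID and event type
    print(f"Bucket: {event['bucket']}, object: {event['name']}, type: {context.event_type}")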
Google Cloud Functions –Event Providers
• HTTP — invoke functions directly via HTTP requests
• Cloud Storage
• Cloud Pub/Sub
• Firebase (DB, Analytics, Auth)
• Stackdriver Logging
• Cloud Firestore
• Google Compute Engine
• BigQuery
Azure Functions
• Azure Functions is a serverless solution that allows you to write less
code, maintain less infrastructure, and save on costs. Instead of worrying
about deploying and maintaining servers, the cloud infrastructure
provides all the up‐to‐date resources needed to keep your applications
running.
• User focuses on the pieces of code, and Azure Functions handles the
rest.
• A function is the primary concept in Azure
Functions.
• A function contains two important pieces ‐ your code, which can be
written in a variety of languages, and some configurations, the
function.json file.
• For compiled languages, this config file is generated automatically from
annotations in your code. For scripting languages, you must provide the
config file yourself.
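A minimal HTTP-triggered Azure Function in Python is sketched below; since Python is a scripting language, an accompanying function.json (not shown) must declare the HTTP trigger binding, as noted above.

# Minimal HTTP-triggered Azure Function (Python programming model).
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    name = req.params.get("name", "world")
    return func.HttpResponse(f"Hello, {name}!", status_code=200)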
Azure Functions – Build your Functions
Options and resources:
Sustainable Cloud Computing
Cloud Data Centre
Energy Management
Carbon Footprint
Sustainable Cloud Computing - I
Sustainable Cloud Computing
● Cloud Service Providers (CSPs) rely heavily on the Cloud
Data Centers (CDCs) to support the ever‐increasing demand
for their computational and application services.
● The financial and carbon footprint related costs of running such large infrastructure negatively impact the sustainability of cloud services.
● Focus on minimizing the energy consumption and carbon
footprints and ensuring reliability of the CDCs – goal of
Sustainable Cloud Computing
Sustainable Cloud Computing
• Cloud computing paradigm offers on‐demand, subscription‐
oriented services over the Internet to host applications and
process user workloads.
• To ensure the availability and reliability of the services, the
components of Cloud Data Centers (CDCs), such as network
devices, storage devices and servers are to be made available
round-the-clock.
• However, creating, processing, and storing each bit of data
adds to the energy cost, increases carbon footprints, and
further impacts the environment.
Sustainable Cloud Computing
Energy Consumption in Cloud Datacenters
Ref: (1) Rajkumar Buyya and Sukhpal Singh Gill. “Sustainable Cloud Computing: Foundations and Future Directions.” Business Technology &
Digital Transformation Strategies, Cutter Consortium, Vol. 21, no. 6, Pages 1‐9, 2018; (2) Anders SG Andrae, and Tomas Edler. “On global
electricity usage of communication technology: trends to 2030.” Challenges, vol. 6, no. 1, pp. 117‐157, 2015.
Sustainable Cloud Computing
• Components (networks, storage, memory and cooling
systems) of CDCs are consuming huge amount of energy.
• To improve energy efficiency of CDC, there is a need for
energy‐aware resource management technique for
management of all the resources (including servers, storage,
memory, networks, and cooling systems) in a holistic manner.
• Due to the under‐loading/ over‐loading of infrastructure
resources, the energy consumption in CDCs is not efficient; in
fact, most of the energy is consumed while some resources
(i.e., networks, storage, memory, processor) are in idle state,
increasing the overall cost of cloud services.
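As a back-of-the-envelope illustration of why idle and under-loaded resources make CDC energy use inefficient, the sketch below uses a commonly cited linear server power model, P(u) = P_idle + (P_peak - P_idle) * u. The model and the wattage figures are assumptions for illustration, not numbers from the slides.

# Toy calculation: idle servers still draw a large fraction of peak power.
P_IDLE, P_PEAK = 120.0, 300.0               # watts, illustrative values only

def power(utilization):
    return P_IDLE + (P_PEAK - P_IDLE) * utilization

# The same total load (2 "servers worth" of work) hosted two ways:
spread_out   = 10 * power(0.20)             # 10 servers at 20% utilization -> 1560 W
consolidated = 2 * power(1.00)              # 2 fully loaded servers (8 switched off) -> 600 W

print(f"10 servers @ 20%: {spread_out:.0f} W")
print(f" 2 servers @ 100%: {consolidated:.0f} W")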
Sustainable Cloud Computing
• CSPs are finding other alternative ways to reduce carbon
footprints of their CDCs
• Major CSPs (like Google, Amazon, Microsoft and IBM) are
planning to power their datacenters using renewable energy
sources.
• Future CDCs are required to provide cloud services with
minimum emissions of carbon footprints and heat release in
the form of greenhouse gas emissions.
Sustainable Cloud Computing
To enable sustainable cloud computing, datacenters can be
relocated based on:
opportunities for waste heat recovery
accessibility of green resource and
proximity of free cooling resources
To resolve these issues and substantially reduce energy
consumption of CDCs, there is a need for cloud computing
architectures that can provide sustainable cloud services
through holistic management of resources.
Sustainable Cloud Computing
Reliability
• To identify system failures and their reasons to manage the risks
• To reduce SLA violation and service delay
• To protect critical information from security attacks
• To make point to point communication using encryption and decryption
• To provide secure VM migration mechanism
• To improve capability of the system
• To reduce Turn of Investment (ToI)
Implication of Reliability on Sustainability
• Improving energy utilization reduces electricity bills and operational costs, which enables sustainable cloud computing.
• However, to provide reliable cloud services, the business operations of
different cloud providers are replicating services, which needs additional
resources and increases energy consumption.
• Thus, a trade‐off between energy consumption and reliability is required to
provide cost‐efficient cloud services.
• Existing energy efficient resource management techniques consume a huge
amount of energy while executing workloads, which decreases resources
leased from cloud datacenters.
• Dynamic Voltage and Frequency Scaling (DVFS) based energy management
techniques reduced energy consumption, but response time and service
delay are increased due to the switching of resources between high scaling
and low scaling modes.
Implication of Reliability on Sustainability
• Reliability of the system component is also affected by excessive
turning on/off servers.
• Power modulation decreases the reliability of server components
like storage devices, memory etc.
• By reducing energy consumption of CDCs, we can improve the
resource utilization, reliability and performance of the server.
• There is a need for new energy-aware resource management
techniques to reduce power consumption without affecting the
reliability of cloud services.
Sustainable Cloud Computing – Components
Ref: Rajkumar Buyya and Sukhpal Singh Gill. “Sustainable Cloud Computing: Foundations and Future Directions.” Business Technology & Digital
Transformation Strategies, Cutter Consortium, Vol. 21, no. 6, Pages 1‐9, 2018;
Sustainable Cloud Computing – Components
• Application Model:
– In sustainable cloud computing, the application model plays a vital role and the
efficient structure of an application can improve the energy efficiency of cloud
datacenters.
– Applications models can be data parallel, function parallel and message passing.
• Resources Targeted in Energy Management:
– Energy consumption of processor, memory, storage, network and cooling of cloud
datacenters is typically reported as 45%, 15%, 10%, 10% and 20% respectively
– Power regulation approaches increase energy consumption during workload
execution, which affects the resource utilization of CDCs.
– DVFS attempts to solve the problem of resource utilization but switching of
resources between high scaling and low scaling modes increases response time
and service delay, which may violate the SLA.
– Putting servers in sleeping mode or turning on/off servers may affect the
availability/ reliability of the system components.
– Thus improving energy efficiency of cloud datacenters affects the resource
utilization, reliability and performance of the server.
Sustainable Cloud Computing – Components
• Thermal‐aware Scheduling
– Components of thermal‐aware scheduling are architecture and scheduling mechanisms.
Architecture can be single‐core or multi‐core while scheduling mechanism can be reactive or
proactive.
– Heating problem during execution of workloads reduces the efficiency of cloud datacenters. To
solve the heating problem of CDCs, thermal‐aware scheduling is designed to minimize the
cooling set‐point temperature, hotspots and thermal gradient
– Existing thermal‐aware techniques focused on reducing Power Usage Efficiency (PUE) can be
found, but a reduction in PUE may not reduce the Total Cost of Ownership (TCO).
• Virtualization
– During the execution of workloads, VM migration is performed to balance the load effectively
to utilize renewable energy resources in decentralized CDCs.
– Due to the lack of on-site renewable energy, the workloads are migrated to other machines distributed geographically.
– VM technology also offers migration of workloads from renewable energy based cloud
datacenters to the cloud datacenters utilizing the waste heat at another site.
– To balance the workload demand and renewable energy, VM based workload migration and
consolidation techniques provide virtual resources using few physical servers.
Sustainable Cloud Computing – Components
• Capacity Planning
– Cloud service providers must involve an effective and organized capacity planning to attain the
expected return‐on‐investment (ROI). The capacity planning can be done for power
infrastructure, IT resources and workloads.
– There is a need to consider important utilization parameters per application to maximize the
utilization of resources through virtualization by finding the applications, which can be merged.
Merging of applications improves resource utilization and reduces capacity cost.
– For efficient capacity planning, cloud workloads should be analysed before execution to finish
its execution for deadline‐oriented workloads.
– There is also a need for effective capacity planning for data storage and processing at lower cost.
• Renewable Energy
– Renewable energy source (e.g. solar or wind), the energy storage device and the location (off‐
site or on‐site) are important factors, which can be optimized. Carbon Usage Efficiency (CUE)
can be reduced by adding more renewable energy resources.
– Major challenges of renewable energy are unpredictability and high capital.
– Workload migration and energy‐aware load balancing techniques attempt to address the issue
of unpredictability in supply of renewable energy
– Cloud datacenters are required to be placed nearer to renewable energy sources to be cost-effective.
Sustainable Cloud Computing – Components
• Waste Heat Utilization
– The cooling mechanism and heat transfer model play an important role in utilizing waste heat effectively.
– Due to consumption of large amounts of energy, CDCs act as heat generators. The vapor-absorption based cooling systems of CDCs can use this waste heat, which is utilized during evaporation.
– Vapor‐absorption based free cooling techniques may help in reducing the cooling
expenses. The energy efficiency of CDCs can be improved by reducing the energy
usage in cooling.
Sustainable Cloud Computing
• The ever‐increasing demand for cloud computing services that are
deployed across multiple cloud datacenters harnesses significant
amount of power, resulting in not only high operational cost but
also high carbon emissions
• The next generation of cloud computing must be energy efficient
and sustainable to fulfill end‐user requirements
• In sustainable cloud computing, the CDCs are powered by
renewable energy resources by replacing the conventional fossil
fuel‐based grid electricity or brown energy to effectively reduce
carbon emissions
• Sustainability with high performance and reliability is one of the
primary goals
Rajkumar Buyya and Sukhpal Singh Gill. “Sustainable Cloud Computing:
Foundations and Future Directions.” Business Technology & Digital Transformation
Strategies, Cutter Consortium, Vol. 21, no. 6, Pages 1‐9, 2018;
Anders SG Andrae, and Tomas Edler. “On global electricity usage of
communication technology: trends to 2030.” Challenges, vol. 6, no. 1, pp. 117‐
157, 2015.
Zhenhua Liu, Yuan Chen, Cullen Bash, Adam Wierman, Daniel Gmach, Zhikui
Wang, Manish Marwah, and Chris Hyser. “Renewable and cooling aware workload
management for sustainable datacenters.” ACM SIGMETRICS Performance
Evaluation Review, vol. 40, no. 1, pp. 175‐186, 2012.
Sukhpal Singh Gill and Rajkumar Buyya. 2018. A Taxonomy and Future Directions
for Sustainable Cloud Computing: 360 Degree View. ACM Comput. Surv. 51, 5,
Article 104 (December 2018), 33 pages.
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 11: Cloud Computing Paradigms
Lecture 55: Sustainable Cloud Computing ‐ II
Sustainable Computing
Sustainable Cloud Computing
Sustainable Cloud Computing - Taxonomy
Energy Management
Carbon Footprint
Sustainable Cloud Computing - II
Sustainable Cloud Computing
● Cloud Service Providers (CSPs) rely heavily on the Cloud
Data Centers (CDCs) to support the ever‐increasing demand
for their computational and application services.
● The financial and carbon footprint related costs of running
such a large infrastructure negatively impact the
sustainability of cloud services.
● The goal of sustainable cloud computing is to minimize energy
consumption and carbon footprint while ensuring reliability
of the CDCs.
Sustainable Cloud Computing
Ref: Rajkumar Buyya and Sukhpal Singh Gill. “Sustainable Cloud Computing: Foundations and Future Directions.” Business Technology & Digital
Transformation Strategies, Cutter Consortium, Vol. 21, no. 6, Pages 1‐9, 2018;
Sustainable Cloud Computing – Components
• Application Model:
– In sustainable cloud computing, the application model plays a vital role and the
efficient structure of an application can improve the energy efficiency of cloud
datacenters.
– Application models can be data parallel, function parallel or message passing.
• Resources Targeted in Energy Management:
– Energy consumption of the processor, memory, storage, network and cooling of cloud
datacenters is typically reported as 45%, 15%, 10%, 10% and 20% respectively.
– Power regulation approaches increase energy consumption during workload
execution, which affects the resource utilization of CDCs.
– DVFS attempts to solve the problem of resource utilization, but switching
resources between high scaling and low scaling modes increases response time
and service delay, which may violate the SLA (see the power-model sketch below).
– Putting servers in sleep mode or turning servers on/off may affect the
availability/reliability of the system components.
– Thus, improving the energy efficiency of cloud datacenters affects the resource
utilization, reliability and performance of the servers.
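To make the power and DVFS trade-offs above concrete, the following is a minimal, illustrative sketch (not taken from the lecture or the cited papers) of the linear server power model commonly used in cloud simulators, plus a rough DVFS energy estimate; every constant in it is an assumption chosen only for demonstration.

# Minimal sketch, all numbers are illustrative assumptions, not measurements.

def server_power(utilization, idle_w=83.4, busy_w=107.3):
    """Linear power model: P(u) = P_idle + (P_busy - P_idle) * u."""
    return idle_w + (busy_w - idle_w) * utilization

def dvfs_energy(cycles, freq_hz, capacitance=1e-9, voltage=1.0):
    """Dynamic CPU energy ~ C * V^2 * f * t, with t = cycles / f; under DVFS a
    lower frequency also allows a lower voltage, so energy drops while time grows."""
    exec_time = cycles / freq_hz
    dynamic_power = capacitance * voltage ** 2 * freq_hz
    return dynamic_power * exec_time  # joules

if __name__ == "__main__":
    for u in (0.2, 0.5, 0.9):
        print(f"utilization {u:.0%}: {server_power(u):.1f} W")
    # Lower frequency and voltage -> less energy, but a longer execution time
    print(dvfs_energy(1e9, 2.0e9, voltage=1.1), dvfs_energy(1e9, 1.0e9, voltage=0.9))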
Sustainable Cloud Computing – Components
• Thermal‐aware Scheduling
– Components of thermal‐aware scheduling are architecture and scheduling mechanisms.
Architecture can be single‐core or multi‐core while scheduling mechanism can be reactive or
proactive.
– The heating problem during execution of workloads reduces the efficiency of cloud datacenters. To
solve the heating problem of CDCs, thermal-aware scheduling is designed to minimize the
cooling set-point temperature, hotspots and the thermal gradient.
– Existing thermal-aware techniques focused on reducing Power Usage Effectiveness (PUE) can be
found, but a reduction in PUE may not reduce the Total Cost of Ownership (TCO).
• Virtualization
– During the execution of workloads, VM migration is performed to balance the load effectively and
to utilize renewable energy resources in decentralized CDCs.
– When on-site renewable energy is insufficient, workloads are migrated to other machines
distributed geographically.
– VM technology also offers migration of workloads from renewable energy based cloud
datacenters to cloud datacenters utilizing the waste heat at another site.
– To balance workload demand and renewable energy, VM based workload migration and
consolidation techniques provide virtual resources using fewer physical servers (see the
consolidation sketch below).
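The consolidation idea in the last bullet can be illustrated with a small, hypothetical sketch of first-fit-decreasing placement; this is my simplification, not the scheduling algorithm of any particular CDC, and the demands and capacities are arbitrary.

# Minimal consolidation sketch (illustrative assumption, not the lecture's algorithm):
# first-fit-decreasing placement of VMs onto as few physical servers as possible.

def consolidate(vm_demands, host_capacity):
    """Place VMs (CPU demand) on hosts of equal capacity using first-fit decreasing."""
    hosts = []  # each host is a list of placed VM demands
    for demand in sorted(vm_demands, reverse=True):
        for host in hosts:
            if sum(host) + demand <= host_capacity:
                host.append(demand)
                break
        else:
            hosts.append([demand])  # power on a new host only when needed
    return hosts

if __name__ == "__main__":
    vms = [0.6, 0.3, 0.5, 0.2, 0.4, 0.1]
    placement = consolidate(vms, host_capacity=1.0)
    print(f"{len(vms)} VMs packed onto {len(placement)} hosts: {placement}")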
Sustainable Cloud Computing – Components
• Capacity Planning
– Cloud service providers must undertake effective and organized capacity planning to attain the
expected return-on-investment (ROI). Capacity planning can be done for power
infrastructure, IT resources and workloads.
– Important per-application utilization parameters must be considered to maximize resource
utilization through virtualization, by identifying the applications that can be merged.
Merging applications improves resource utilization and reduces capacity cost.
– For efficient capacity planning, cloud workloads should be analysed before execution so that
deadline-oriented workloads can finish within their deadlines.
– Effective capacity planning is also needed for data storage and data processing at lower cost.
• Renewable Energy
– The renewable energy source (e.g. solar or wind), the energy storage device and the location (off-
site or on-site) are important factors that can be optimized. Carbon Usage Effectiveness (CUE)
can be reduced by adding more renewable energy resources.
– Major challenges of renewable energy are unpredictability of supply and high capital cost.
– Workload migration and energy-aware load balancing techniques attempt to address the issue
of unpredictability in the supply of renewable energy.
– Cloud datacenters should be located near renewable energy sources to make their use cost
effective.
Sustainable Cloud Computing – Components
• Waste Heat Utilization
– The cooling mechanism and heat transfer model play an important role in utilizing
waste heat effectively.
– Because they consume large amounts of energy, CDCs act as heat generators. The
vapor-absorption based cooling systems of CDCs can use this waste heat,
consuming it during evaporation.
– Vapor-absorption based free cooling techniques may help in reducing the cooling
expenses. The energy efficiency of CDCs can be improved by reducing the energy
usage in cooling.
Sustainable Cloud Computing – Taxonomy
• With huge growth of Internet of Things (IoT)–based applications, the
use of cloud services is increasing exponentially.
• Thus, cloud computing must be energy efficient and sustainable to
fulfill the ever‐increasing end‐user needs.
• Research initiatives on sustainable cloud computing can be categorized
as follows:
– application design
– sustainability metrics
– capacity planning
– energy management
– virtualization
– thermal‐aware scheduling
– cooling management
– renewable energy
– waste heat utilization
Ref: Sukhpal Singh Gill and Rajkumar Buyya. 2018. A Taxonomy and Future Directions for Sustainable Cloud Computing: 360 Degree
View. ACM Comput. Surv. 51, 5, Article 104 (December 2018), 33 pages. https://doi.org/10.1145/3241038
Application Design
• Design of an application plays a vital role and the efficient structure of an
application can improve the energy efficiency of CDCs.
• The resource manager and scheduler follow different approaches for
application modelling
• To make the infrastructure sustainable and environmentally eco‐friendly,
there is a need for green ICT‐based innovative applications
Sustainability Metrics
Capacity Planning
• CSPs must initiate effective and organized capacity planning to enable
sustainable computing.
• Capacity planning can be done for power infrastructure, IT
infrastructure, and cooling mechanism.
Energy Management
• Energy management in sustainable computing is an important factor for CSPs
• Improving energy use reduces electricity bills and operational costs to enable
sustainable cloud computing.
• Essential requirements of sustainable CDCs are optimal software system design,
optimized air ventilation, and installing temperature monitoring tools for
adequate resource utilization, which improves energy efficiency
Virtualization
• CDCs consist of chassis and racks that house the servers which process the IT
workloads.
• To maintain the temperature of datacenters, cooling mechanisms are
needed.
• Servers produce heat during execution of IT workload. The processor is an
important component of a server and consumes the most electricity.
• Both cooling and computing mechanisms consume a huge amount of
electricity. Proper management is needed.
Cooling Management
• The increasing demand for computation, networking, and storage expands the
complexity, size, and energy density of CDCs exponentially, which consumes a
large amount of energy and produces a huge amount of heat.
• To make CDCs more energy efficient and sustainable, we need an effective cooling
management system, which can maintain the temperature of CDCs
Renewable Energy
• Sustainable computing needs energy‐efficient workload execution by using
renewable energy resources to reduce carbon emissions
• Green energy resources, such as solar, wind, and water generate energy
with nearly zero carbon‐dioxide emissions
Waste Heat Utilization
• Reuse of waste heat is becoming a solution for fulfilling energy demand in
energy conservation systems
• The vapor‐absorption‐based cooling systems can use waste heat, and
remove the heat while evaporating.
• Vapor‐absorption‐based free cooling mechanisms can make the value of
PUE (Power Usage Effectiveness) ideal by neutralizing cooling expenses.
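For reference, PUE and CUE mentioned on these slides are simple ratios (as defined by The Green Grid); the sketch below computes them for an assumed energy mix, with all numbers invented purely for illustration.

# Minimal sketch; the numbers below are illustrative assumptions.

def pue(total_facility_energy_kwh, it_energy_kwh):
    """Power Usage Effectiveness = total facility energy / IT equipment energy (ideal = 1.0)."""
    return total_facility_energy_kwh / it_energy_kwh

def cue(total_co2_kg, it_energy_kwh):
    """Carbon Usage Effectiveness = total CO2 emissions from the datacenter's
    energy use / IT equipment energy (kgCO2e per kWh of IT energy)."""
    return total_co2_kg / it_energy_kwh

if __name__ == "__main__":
    it, cooling, other = 1000.0, 400.0, 100.0        # kWh over some interval (assumed)
    grid_kwh, solar_kwh = 1100.0, 400.0              # assumed energy mix
    co2 = grid_kwh * 0.7 + solar_kwh * 0.0           # 0.7 kgCO2e/kWh grid factor (assumed)
    print("PUE:", pue(it + cooling + other, it))     # 1.5
    print("CUE:", cue(co2, it))                      # falls as the renewable share grows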
Sustainable Cloud Computing
• The ever-increasing demand for cloud computing services that are
deployed across multiple cloud datacenters consumes a significant
amount of power, resulting in not only high operational cost but
also high carbon emissions
• The next generation of cloud computing must be energy efficient
and sustainable to fulfill end-user requirements
• In sustainable cloud computing, the CDCs are powered by
renewable energy resources, replacing the conventional fossil
fuel-based grid electricity (brown energy) to effectively reduce
carbon emissions
• Sustainability with high performance and reliability is one of the
primary goals
Rajkumar Buyya and Sukhpal Singh Gill. “Sustainable Cloud Computing: Foundations and
Future Directions.” Business Technology & Digital Transformation Strategies, Cutter
Consortium, Vol. 21, no. 6, Pages 1‐9, 2018
Sukhpal Singh Gill and Rajkumar Buyya. 2018. A Taxonomy and Future Directions for
Sustainable Cloud Computing: 360 Degree View. ACM Comput. Surv. 51, 5, Article 104
(December 2018), 33 pages.
Anders SG Andrae, and Tomas Edler. “On global electricity usage of communication
technology: trends to 2030.” Challenges, vol. 6, no. 1, pp. 117‐157, 2015.
Zhenhua Liu, Yuan Chen, Cullen Bash, Adam Wierman, Daniel Gmach, Zhikui Wang, Manish
Marwah, and Chris Hyser. “Renewable and cooling aware workload management for
sustainable datacenters.” ACM SIGMETRICS Performance Evaluation Review, vol. 40, no. 1, pp.
175‐186, 2012.
Sukhpal Singh Gill, Inderveer Chana, Maninder Singh and Rajkumar Buyya. 2018. RADAR: Self‐
Configuring and Self‐Healing in Resource Management for Enhancing Quality of Cloud
Services, Concurrency and Computation: Practice and Experience (CCPE), 2018.
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 12: Cloud Computing Paradigms
Lecture 56: Cloud Computing in 5G Era
5G Network
Cloud Computing in 5G
Spatial Data
Spatial Cloud Computing
5G Network
• 5G is the 5th generation mobile network. It is a new global wireless
standard after 1G, 2G, 3G, and 4G networks. 5G enables a new kind of
network that is designed to connect virtually everyone and everything
together including machines, objects, and devices.
• 5G wireless technology is meant to deliver higher multi‐Gbps peak data
speeds, ultra low latency, more reliability, massive network capacity,
increased availability, and a more uniform user experience to more
users. Higher performance and improved efficiency empower new user
experiences and connect new industries.
Different Generations
• First generation ‐ 1G ‐ 1980s: 1G delivered analog voice.
• Second generation ‐ 2G ‐ Early 1990s: 2G introduced digital voice
(e.g. CDMA‐ Code Division Multiple Access).
• Third generation ‐ 3G ‐ Early 2000s: 3G brought mobile data (e.g.
CDMA2000).
• Fourth generation ‐ 4G LTE ‐ 2010s: 4G LTE ushered in the era of mobile
broadband.
• 1G, 2G, 3G, and 4G all led to 5G, which is designed to provide more
connectivity than was ever available before.
• 5G is a unified, more capable air interface. It has been designed with an
extended capacity to enable next‐generation user experiences, empower
new deployment models and deliver new services.
• With high speeds, superior reliability and negligible latency, 5G is all set to
expand the mobile ecosystem into new realms.
• 5G will impact Cloud Computing paradigm in a big way.
Evolution of Mobile Networks
4G vs 5G Features
Use of 5G
• A key aspect of 5G is that it is designed for forward compatibility: the ability to flexibly
support future services.
• 5G is used across three main types of connected services.
• Enhanced mobile broadband
In addition to making our smartphones better, 5G mobile technology can
usher in new immersive experiences such as VR and AR with faster, more
uniform data rates, lower latency, and lower cost‐per‐bit.
• Mission‐critical communications
5G can enable new services that can transform industries with ultra-reliable,
available, low‐latency links like remote control of critical infrastructure,
vehicles, and medical procedures.
• Massive IoT
5G is meant to seamlessly connect a massive number of embedded sensors
in virtually everything through the ability to scale down in data rates, power,
and mobility—providing extremely lean and low‐cost connectivity solutions.
5G Network ‐ Features
• Enhanced mobile broadband (eMBB) – enhanced indoor and
outdoor broadband, enterprise collaboration, augmented and virtual
reality.
• Massive machine‐type communications (mMTC) – IoT, asset tracking,
smart agriculture, smart cities, energy monitoring, smart home,
remote monitoring.
• Ultra‐reliable and low‐latency communications (URLLC) –
autonomous vehicles, smart grids, remote patient monitoring and
telehealth, industrial automation.
5G and Cloud Computing
• 5G is the perfect companion to cloud computing both in terms of its
distribution and the diversity of compute and storage capabilities.
• On-premises and edge data centers will continue to close the gap between
resource-constrained low-latency devices and distant cloud data centers,
driving the need for heterogeneous and distributed computing
architectures.
• In this evolving computing paradigm, service providers should look to
provide full end-to-end orchestration, with defined service level
agreements, in a self-service and automated way.
• Network as a Platform for enterprise services
• Service orchestration will play a key role moving forward, enabling industrial
applications to interact with the network resources in advanced ways such
as selecting location, quality of service, or influencing the traffic routing to
deliver on application demands.
5G and Cloud Computing
• Two key aspects in the relationship between 5G technologies
and cloud computing.
– First, further development of cloud computing has to meet the
5G needs. This is reflected by growing roles of edge, mobile
edge, and fog computing in the cloud computing realm.
– The second aspect is that 5G technologies are undergoing
“cloudification” through network “softwarization”, NFV, SDN,
etc.
– Both technology types influence the developments of each
other.
• 5G deployments bring up discussions about the convergence of
computing, cloud, and IoT that takes us to the era of hyper‐
connectivity.
Edge Computing in 5G
• 5G is the next generation cellular network that aspires to achieve
substantial improvement on quality of service, such as higher
throughput and lower latency.
• Edge computing is an emerging technology that enables the evolution
to 5G by bringing cloud capabilities near to the end users (or user
equipment, UEs) in order to overcome the intrinsic problems of the
traditional cloud, such as high latency etc.
• Edge computing is preferred to cater for the wireless communication
requirements of next generation applications, such as augmented
reality and virtual reality, which are interactive in nature.
– These highly interactive applications are computationally‐intensive and
have high quality of service (QoS) requirements, including low latency
and high throughput.
– Further, these applications are expected to generate a massive amount of data.
Edge Computing in 5G
• 5G is expected to cater to the following needs of today’s network
traffic
– Handle the massive amount of data generated by mobile devices/
IoT devices
– Stringent QoS requirements are imposed to support highly
interactive applications, requiring ultra‐low latency and high
throughput
– A heterogeneous environment must be supported to allow inter-
operability of a diverse range of end-user equipment, QoS
requirements, network types etc.
Edge Computing in 5G ‐ Applications
• Healthcare
• Entertainment and multimedia applications
• Virtual reality, augmented reality, and mixed reality
• Tactile internet
• Internet of Things
• Factories of the future
• Emergency response
• Intelligent Transportation System
5G and Mobile Cloud Computing (MCC)
• MCC is a cloud computing system including mobile devices and
delivering applications to the mobile devices.
• Key features of MCC for 5G networks include sharing resources for
mobile applications and improved reliability as data is backed up and
stored in the cloud.
• As data processing is offloaded by MCC from the devices to the cloud,
fewer device resources are consumed by applications.
• Compute‐intensive processing of mobile users’ requests is off‐loaded
from mobile networks to the cloud. Mobile devices are connected to
mobile networks via base stations (e.g., base transceiver station,
access point, or satellite).
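A minimal sketch of the offloading decision described above (whether to run a task on the device or in the cloud): this is an illustrative model with assumed parameters, not a prescribed MCC algorithm.

# Minimal sketch: offload a task when remote execution plus transfer beats local
# execution. All parameters below are assumptions chosen for demonstration.

def local_time(cycles, device_speed_hz):
    return cycles / device_speed_hz

def offload_time(cycles, input_bytes, uplink_bps, cloud_speed_hz, rtt_s=0.05):
    transfer = (input_bytes * 8) / uplink_bps
    return rtt_s + transfer + cycles / cloud_speed_hz

if __name__ == "__main__":
    cycles, data = 5e9, 2e6          # 5 giga-cycles of work, 2 MB of input (assumed)
    t_local = local_time(cycles, 1.5e9)                 # 1.5 GHz phone CPU (assumed)
    t_cloud = offload_time(cycles, data, 20e6, 3e10)    # 20 Mbps uplink, fast cloud (assumed)
    print(f"local {t_local:.2f}s vs offload {t_cloud:.2f}s ->",
          "offload" if t_cloud < t_local else "run locally")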
Mobile Cloud Computing (MCC)
Qualcomm: https://www.qualcomm.com/5g/what‐is‐5g
Ericsson: https://www.ericsson.com/en/blog/2021/2/5g‐and‐cloud
N. Hassan, K. A. Yau and C. Wu, "Edge Computing in 5G: A Review," in
IEEE Access, vol. 7, pp. 127276‐127289, 2019, doi:
10.1109/ACCESS.2019.2938534
Setting the Scene for 5G: Opportunities & Challenges. International
Telecommunication Union, 2018
Securing 4G, 5G and Beyond with Fortinet:
https://www.fortinet.com/solutions/mobile‐carrier.html
How 5G Transforms Cloud Computing, Dell Technologies,
https://education.dellemc.com/content/dam/dell‐emc/documents/en‐
us/2020KS_Gloukhovtsev_How_5G_Transforms_Cloud_Computing.pdf
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 12: Cloud Computing Paradigms
Lecture 57: CPS and Cloud Computing
Cyber Physical System (CPS)
CPS and Cloud Computing
Cyber‐Physical System (CPS)
• A cyber‐physical system (CPS) is an orchestration of computers and physical
systems. Embedded computers monitor and control physical processes,
usually with feedback loops, where physical processes affect computations
and vice versa.
• The term “cyber‐physical systems” emerged around 2006, when it was
coined by Helen Gill at the National Science Foundation, USA.
• CPS is about the intersection, not the union, of the physical and the cyber. It
combines engineering models and methods from mechanical,
environmental, civil, electrical, biomedical, chemical, aeronautical and
industrial engineering with the models and methods of computer science.
• Applications of CPS include automotive systems, manufacturing, medical
devices, military systems, assisted living, traffic control and safety, process
control, power generation and distribution, energy conservation etc.
Cyber‐Physical System (CPS)
Cyber‐Physical System (CPS)
• CPS describes a broad range of complex, multi-disciplinary, physically-
aware next generation engineered systems that integrate embedded
computing technologies (the cyber part) into the physical world.
• In cyber‐physical systems, physical and software components are
deeply intertwined, able to operate on different spatial and temporal
scales, exhibit multiple and distinct behavioral modalities, and interact
with each other in ways that change with context.
• CPS involves transdisciplinary approaches, merging theory of
cybernetics, mechatronics, design and process science.
• Cyber + Physical + Computation + Dynamics + Communication +
Security + Safety
Cyber‐Physical System (CPS)
• Cyber physical systems (CPS) are an emerging discipline that involves engineered
computing and communicating systems interfacing the physical world.
• Ongoing advances in science and engineering improve the tie between computational
and physical elements by means of intelligent mechanisms, increasing the
adaptability, autonomy, efficiency, functionality, reliability, safety, and usability of
cyber‐physical systems.
• Potential applications of cyber‐physical systems are in several areas, including:
intervention (e.g., collision avoidance); precision (e.g., robotic surgery and nano-level
manufacturing); operation in dangerous or inaccessible environments (e.g., search
and rescue, firefighting, and deep-sea exploration); coordination (e.g., air traffic
control, war fighting); efficiency (e.g., zero‐net energy buildings); and augmentation
of human capabilities (e.g. in healthcare monitoring and delivery).
• Typical examples of CPS include : smart grid, autonomous automobile systems,
medical monitoring, industrial control systems, robotics systems, and automatic pilot
avionics.
Cyber‐Physical System (CPS)
• The interlinked networks of sensors, actuators and processing devices
create a vast network of connected computing resources, things and
humans.
• A CPS is the “integration of computation with physical processes” and
uses sensors and actuators to link the computational systems to the
physical world.
• CPS can be viewed as “computing as a physical act” where the real
world is monitored through sensors that transfer sensing data into the
cyberspace where cyber applications and services use the data to
affect the physical environment
• Cloud Computing Services provide a flexible platform for realizing the
goals of CPS
Cyber‐Physical System (CPS)
Cyber‐Physical System (CPS)
• The interlinked networks of sensors, actuators and processing devices create a
vast network of connected computing resources, things and humans that we will
refer to as a Smart Networked Systems and Societies (SNSS).
• A CPS is the “integration of computation with physical processes” and uses
sensors and actuators to link the computational systems to the physical world.
• CPS can be viewed as “computing as a physical act” where the real world is
monitored through sensors that transfer sensing data into the cyberspace where
cyber applications and services use the data to affect the physical environment
• Cloud Computing Services provide a flexible platform for realizing the goals of
CPS
• A Cyber‐Physical Cloud Computing (CPCC) architectural framework is defined as
“a system environment that can rapidly build, modify and provision
cyber‐physical systems composed of a set of cloud computing based sensor,
processing, control, and data services.”
CPS and Cloud ‐ Cyber‐Physical Cloud Computing (CPCC)
• A Cyber‐Physical Cloud Computing (CPCC) architectural framework can be defined
as “a system environment that can rapidly build, modify and provision
cyber‐physical systems composed of a set of cloud computing based sensor,
processing, control, and data services.”
Ref: A Vision of Cyber‐Physical Cloud Computing for Smart Networked Systems, NIST Report NIST, USA, August 2013
CPCC Benefits
• Efficient use of resources
• Modular composition
• Rapid development and scalability
• Smart adaptation to environment at every scale
• Reliable and resilient architecture
CPS and Cloud
High level CPCC Scenario
A Cloud-based CPS architecture for
Intelligent Monitoring of Machining Processes
Cloud-Edge Computing Framework for CPS
Ref: X. Wang, L. T. Yang, X. Xie, J. Jin and M. J. Deen, "A Cloud‐Edge Computing Framework for Cyber‐Physical‐Social
Services," in IEEE Communications Magazine, vol. 55, no. 11, pp. 80‐85, Nov. 2017, doi: 10.1109/MCOM.2017.1700360
Cloud‐Edge Computing Framework for CPS
https://en.wikipedia.org/wiki/Cyber‐physical_system
A Vision of Cyber‐Physical Cloud Computing for Smart Networked Systems, NIST
Report NIST, USA, August 2013
Architecture of Cyber‐Physical Systems Based on Cloud, Shaojie Luo, Lichen
Zhang, Nannan Guo, Proceedings of IEEE 5th Intl Conference on Big Data Security
on Cloud (BigDataSecurity), 2019
Lee EA. The Past, Present and Future of Cyber‐Physical Systems: A Focus on
Models. Sensors. 2015; 15(3):4837-4869. https://doi.org/10.3390/s150304837
X. Wang, L. T. Yang, X. Xie, J. Jin and M. J. Deen, "A Cloud‐Edge Computing
Framework for Cyber‐Physical‐Social Services," in IEEE Communications Magazine,
vol. 55, no. 11, pp. 80‐85, Nov. 2017, doi: 10.1109/MCOM.2017.1700360.
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 12: Cloud Computing Paradigms
Lecture 58: Case Study I (Spatial Cloud Computing)
Spatial Data
Spatial Cloud
Spatial Analysis on Cloud
Spatial Data and Analysis
• Spatial (or Geospatial) data is information that describes objects, events or
other features with a location on or near the surface of the earth.
• Geospatial data typically combines location information (usually coordinates on
the earth) and attribute information (the characteristics of the object, event or
phenomena concerned) with temporal information (the time or life span at
which the location and attributes exist).
Whenever we look at a map, we inherently start turning that map into information
by analyzing its contents: finding patterns, assessing trends, or making decisions.
Spatial Analytics + Cloud Computing
Emergence of cloud computing provides a potential solution with an elastic, on‐demand
computing platform to integrate – observation systems, parameter extracting
algorithms, phenomena simulations, analytical visualization and decision support, and
to provide social impact and user feedback
Spatial cloud computing refers to the cloud computing paradigm that is driven
by geospatial sciences, and optimized by spatiotemporal principles for enabling
geospatial science discoveries and cloud computing within distributed
computing environment
Spatial Cloud
• It supports shared resource pooling which is useful for participating
organizations with common or shared goals
– Network, Servers, Apps, Services, Storages and Databases
• Choice of various deployment, service and business models to best
suit organization goals
• Managed services prevent data loss from frequent outages,
minimizing financial risks, while increasing efficiency
Spatial Cloud ‐ Advantages
• Easy to use – infrastructure deployment with a click of the mouse, via API and
network.
• Scalability – infrastructure is provisioned based on application requirements; nothing
to purchase.
• Cost – optimized, as charging is based on resource usage.
• Reliability – based on enterprise-grade hardware; can subscribe to
multiple clouds.
• Risk – can change instantly (even the OS).
Spatial Cloud – Typical Architecture
• Private and public organizations want to share their spatial data
– Different requirements of geospatial data space and network bandwidth
• Easy access to spatial services
• GIS decisions are made easier
– Integrate latest databases
– Merge disparate systems
– Exchange information internally and externally
Spatial Cloud – Typical Architecture
Mobility Analytics
(Utilize Cloud platform for computation and storage)
A general framework of Trajectory Trace Mining for
Smart-City Applications
A Trajectory Cloud for enabling Efficient Mobility Services
Traj‐Cloud for analyzing Urban Dynamics
• Mobility trace analysis has a significant role in mapping the urban
dynamics.
– This analysis helps in location‐based service‐provisioning and facilitates
an effective transportation resource planning.
• Key aspect of the intelligent transportation system (ITS) is efficient mobility
analytics to understand the movement behaviours of the people.
• Analysing mobility traces and providing location‐aware service is a
challenging task.
An end‐to‐end cloud‐based framework may facilitate efficient location‐
based service provisioning.
It helps to minimize the service‐waiting time and service‐provisioning
time of location‐based services such as food delivery or medical
emergency
Traj‐Cloud for analyzing Urban Dynamics
Traj‐Cloud Services
Trajectory data Indexing Service (TS1):
• Input: GPS trajectory trace (G) and other semantic information, such as, geotagged
locations or road network
• Output: Spatio-temporal indices of input traces and storage of the information
• GCP Component: Google BigQuery and Cloud SQL storage.
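The slide names Google BigQuery and Cloud SQL as the storage components; the sketch below is only a hypothetical, in-memory stand-in showing what a coarse spatio-temporal index over GPS trajectory points could look like (cell sizes, identifiers and sample points are all assumptions).

# Minimal sketch: bucket each (lat, lon, timestamp) sample into a space-time grid cell.
from collections import defaultdict
from datetime import datetime

def st_key(lat, lon, ts, cell_deg=0.01, slot_min=15):
    """Map a (lat, lon, timestamp) sample to a coarse space-time cell id."""
    row, col = int(lat / cell_deg), int(lon / cell_deg)
    slot = ts.replace(minute=(ts.minute // slot_min) * slot_min, second=0, microsecond=0)
    return (row, col, slot.isoformat())

index = defaultdict(list)          # cell id -> list of (trajectory_id, point)

def add_point(traj_id, lat, lon, ts):
    index[st_key(lat, lon, ts)].append((traj_id, (lat, lon, ts)))

if __name__ == "__main__":
    add_point("veh_42", 22.3149, 87.3105, datetime(2021, 3, 1, 9, 7))
    add_point("veh_42", 22.3151, 87.3110, datetime(2021, 3, 1, 9, 12))
    # Both samples fall in the same space-time cell, so a range query touches one bucket.
    print(len(index), "cells,", sum(len(v) for v in index.values()), "points")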
Cloud‐Fog‐Edge Computing for
Internet of Health Things (IoHT)
Fog Computing
• Fog Computing takes the cloud closer to the data producing
sensor devices. Devices such as routers, servers, switches act
as fog nodes if their processing power is employed for data
processing and result generation.
• Use of Fog technology for real time applications
• The aim is to develop a Fog based healthcare model based on
data collected by IoT based health sensors.
• Collected data will be processed at Edge devices to reduce
latency, network usage and overall cost incurred at the cloud.
• The performance to be evaluated using simulator tool as well
as actual hardware
Cloud‐Fog‐Edge‐IoT
Cloud‐Fog‐Edge Hierarchy
Cloud Limitations
• Latency
• Large volume of data being generated.
• Bandwidth requirement
(Figure: overall workflow – customized body sensor; proposal of a Dew architecture
based Fog inference model to enhance robustness)
Note:
*Heart Attack Prediction algorithm has no medical/ clinical implication and has been used only for
demonstration purposes.
Simulation using iFogSim
(Figure: UML class diagram of the simulation – Java class tryhealth)
Hierarchical Network Topology Model
• Cloud (Level-0)
• Fog (Level-1)
• Edge (Level-2)
• IoT (Level-3)
Anish Poonia, MTech Dissertation, IIT Kharagpur, Fog Computing For Internet of
Health Things, 2020
Anish Poonia, Shreya Ghosh, Akash Ghosh, Shubha Brata Nath, Soumya K. Ghosh,
Rajkumar Buyya, CONFRONT: Cloud‐fog‐dew based monitoring framework for
COVID‐19 management, Internet of Things, Elsevier, Volume 16, 2021
Cisco White Paper. 2015. Fog Computing and the Internet of Things: Extend the
Cloud to Where the Things Are.
Gupta H, Vahid Dastjerdi A, Ghosh SK, Buyya R. iFogSim: A toolkit for modeling and
simulation of resource management techniques in the Internet of Things, Edge and
Fog computing environments. Softw Pract Exper. 2017;47:1275‐296.
https://doi.org/10.1002/spe.2509
Luiz Bittencourt et al., The Internet of Things, Fog and Cloud continuum: Integration
and challenges, Internet of Things, Volumes 3–4, 2018, Pages 134‐155, ISSN 2542‐
6605, https://doi.org/10.1016/j.iot.2018.09.005
Cloud Computing
Prof. Soumya K Ghosh
Department of Computer Science and Engineering
Module 12: Cloud Computing Paradigms
Lecture 60: Case Study II (Internet of Health Things)
(Part-B)
Cloud‐Fog‐Edge‐IoT Framework
Internet of Health Things (IoHT)
Case Study on Cloud‐Fog‐Edge‐IoHT
Cloud-Fog-Edge Computing for
Internet of Health Things (IoHT)
Cloud‐Fog‐Edge‐IoHT
Objectives
• To design a Fog‐Edge Computing based health model to
reduce latency, network usage and cost incurred at the cloud.
• To test the designed fog model using iFogSim simulator.
• To develop a customized wearable device for collection of
health parameters.
• To implement the proposed model over hardware and test its
efficacy.
• To study dew based computing and its efficacy in the
proposed health scenario.
Overall Workflow
Designing & Implementing the Fog
Model followed by performance Results &
Conceptualization evaluation using iFogSim. Inferences of
and Modelling (Simulation required to test the simulation
efficacy of the suggested model)
NPTEL
customized body sensor.
Proposal of Dew
Architecture based Fog Inference
model to enhance
robustness
Note:
*Heart Attack Prediction algorithm has no medical/ clinical implication and has been used only for
demonstration purposes.
Simulation using iFogSim
(Figure: UML class diagram of the simulation – Java class tryhealth)
Hierarchical Network Topology Model
• Cloud (Level-0)
• Fog (Level-1)
• Edge (Level-2)
• IoT (Level-3)
Cloud-Fog-Edge-IoHT – Typical Configuration
Device Configuration
Device    MIPS    RAM (MB)   Up Bw (Kbps)   Down Bw (Kbps)   Level   Cost/MIPS   Busy Power (W)   Idle Power (W)
Cloud     44800   40000      100            10000            0       0.01        16*103           16*83.25
ISP       2800    4000       10000          10000            1       0           107.339          83.4333
AreaGW    2800    4000       10000          10000            2       0           107.339          83.4333
Mobile    350     1000       10000          270              3       0           87.53 mW         82.44 mW

Link Latencies
Source        Destination   Latency
Body sensor   Mobile        1
Mobile        Area GW       2
Area GW       ISP GW        2
ISP GW        Cloud         100
Mobile        Display       1
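Using the link latencies in the table above, a small sketch (my illustration; the round-trip interpretation of the hops is an assumption) shows why placing the control-loop modules at the Area Gateway keeps the loop delay far below the cloud placement.

# Minimal sketch: sum per-hop link latencies from the table for the two placements.

LINKS = {
    ("body_sensor", "mobile"): 1,
    ("mobile", "area_gw"): 2,
    ("area_gw", "isp_gw"): 2,
    ("isp_gw", "cloud"): 100,
    ("mobile", "display"): 1,
}

def path_delay(path):
    """Round-trip delay up to the processing node, plus the hop to the display."""
    up = sum(LINKS[(a, b)] for a, b in zip(path, path[1:]))
    return 2 * up + LINKS[("mobile", "display")]

if __name__ == "__main__":
    fog_loop = ["body_sensor", "mobile", "area_gw"]                      # modules at the Area Gateway
    cloud_loop = ["body_sensor", "mobile", "area_gw", "isp_gw", "cloud"]
    print("fog control loop  :", path_delay(fog_loop))                   # 2*(1+2) + 1 = 7
    print("cloud control loop:", path_delay(cloud_loop))                 # 2*(1+2+2+100) + 1 = 211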
Cloud‐Fog‐Edge‐IoHT – Process Flow
(Figure: process flow originating at the body sensor)
Application Placement
Application Module        Placement in Fog based Model   Placement in Cloud based Model
Client Module             Mobile (Edge)                  Mobile (Edge)
Data Filtering Module     Area Gateway (Fog)             Cloud
Data Processing Module    Area Gateway (Fog)             Cloud
Event Handler Module      Area Gateway (Fog)             Cloud
Confirmatory Module       Cloud                          Cloud
Simulation Configuration
Configuration No. of AreaGW Total No. of Users
1 1 4
2 2 8
3 4 16
4 8 32
5 16 64
Performance Evaluation ‐ Latency
(Chart: Average Latency of Control Loop – delay for Fog vs Cloud placements across
Config 1 to Config 5)
• Fog: Latency is fixed, as the application modules which form part of the control
loop are located at the Area Gateway itself.
• Cloud: Modules are located at the Cloud Datacenter.
Performance Evaluation – Network Usage
(Chart: Network Usage in KBytes for Fog vs Cloud placements across physical topology
configurations Config 1 to Config 5)
• Fog: Network usage is very low, as the Confirmatory module residing on the Cloud is
accessed only for the few positive cases.
• Cloud: Network usage is high, as all modules are now on the Cloud.
Performance Evaluation – Cost of Execution
(Chart: Cost of Execution in Cloud for the Fog based and Cloud based placements
across configurations)
• Fog: Only the resources on Cloud incur cost, other resources are owned by the
organization.
• Cloud: More processing at Cloud leads to higher costs in case of Cloud based
architecture.
Performance Evaluation – Energy Consumption
(Chart: Energy Consumption in kJ, split into Cloud Energy and Mobile Energy, for Fog
and Cloud placements under Config 2, Config 3 and Config 4)
• Energy consumption at the Mobile devices remains the same in Fog as well as Cloud placements, as the load does
not change.
• Energy requirement at the Fog devices and the Datacenter changes as the configuration changes
from the Fog based to the Cloud based architecture, owing to the shifting of application modules.
Hardware Implementation
• Simulated model’s hardware
implementation done using:
– Customized body sensor
– Simulated sensor data
– Raspberry Pi as Fog Devices
– AWS as Cloud
Hardware Implementation
• Customized BP and Pulsemeter – the device has been customized to output serial
data at 9600 baud rate in ASCII format.
• NodeMCU ESP8266 CP2102 Board – Arduino-like hardware I/O with on-board WiFi chip.
• Accelerometer (ADXL345) – each reading has three components: X-axis, Y-axis and Z-axis.
• Raspberry Pi 3 (Fog Device) – 64-bit system with 1 GB RAM, WiFi and Bluetooth
connectivity.
Activity Detection using Accelerometer
(Pipeline figure: Data Extraction → 5-Point Smoothing → Feature Extraction → KNN Classifier)
• Data Extraction – the collected data has three components: x-axis, y-axis, z-axis;
the magnitude is A = sqrt(x*x + y*y + z*z).
• 5-Point Smoothing – to reduce any induced noise, each signal is obtained as an
average of five signals: two preceding signals, the signal itself and two succeeding
signals.
• Feature Extraction – the following features were extracted from the filtered signal:
– Maximum Amplitude
– Minimum Amplitude
– Mean Amplitude
– Standard Deviation in Amplitude
– Energy in Time Domain
– Energy in Frequency Domain
• K-Nearest Neighbour Classifier – feature values are normalized as
Y = (x - min) / (max - min); K = 3 (based on 5-fold cross validation using the
GridSearchCV library) is used for classification.
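The pipeline above can be sketched end to end in Python; this is my simplified reconstruction (synthetic data, time-domain features only, scikit-learn's MinMaxScaler and KNeighborsClassifier), not the project's actual code.

# Minimal sketch: magnitude, 5-point smoothing, simple features, min-max
# normalization and a k-NN classifier with k = 3. Training data is synthetic.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

def magnitude(samples):
    """samples: array of (x, y, z) readings -> A = sqrt(x^2 + y^2 + z^2)."""
    return np.sqrt((np.asarray(samples) ** 2).sum(axis=1))

def smooth5(signal):
    """5-point moving average: two preceding, the sample itself, two succeeding."""
    return np.convolve(signal, np.ones(5) / 5, mode="valid")

def features(signal):
    return [signal.max(), signal.min(), signal.mean(), signal.std(), np.sum(signal ** 2)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    windows, labels = [], []
    for label, scale in (("rest", 0.1), ("walk", 1.0)):          # two synthetic activity classes
        for _ in range(20):
            sig = smooth5(magnitude(rng.normal(0, scale, size=(64, 3))))
            windows.append(features(sig)); labels.append(label)
    X = MinMaxScaler().fit_transform(windows)                     # Y = (x - min) / (max - min)
    clf = KNeighborsClassifier(n_neighbors=3).fit(X, labels)      # k chosen as on the slide
    print(clf.predict(X[:2]), clf.score(X, labels))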
Case Study: Cardiac Attack Prediction
(Table: Cardiac Attack Prediction Logic – health parameters and their corresponding alarm values)
Dew based Cloud‐Fog‐Edge‐IoHT ‐ Workflow
(Flowchart: Start → the sensor collects data → the reading is processed at the Fog or the
Cloud depending on whether Cloud connectivity is available, with results synchronized to
the Cloud; when Cloud connectivity is unavailable but Dew connectivity is available, Dew
based processing is used and results are synchronized with the Cloud once Dew–Cloud
connectivity returns; otherwise the system waits. The result is then displayed → Stop.)
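One plausible reading of the workflow above, expressed as a small routing sketch; the exact branch order in the original flowchart is not fully recoverable, so treat the logic below as an assumption.

# Minimal sketch (an assumed reading, not the project's exact logic): route each
# sensor reading to cloud, dew, or a wait queue depending on connectivity, and
# keep a backlog to sync with the cloud later.

def handle_reading(reading, cloud_up, dew_up, pending):
    """Return where the reading was processed; queue it if nothing is reachable."""
    if cloud_up:
        result = f"cloud-processed({reading})"   # fog pre-processing + cloud analytics
    elif dew_up:
        result = f"dew-processed({reading})"     # local dew node keeps the service alive
        pending.append(reading)                  # sync with the cloud when the link returns
    else:
        pending.append(reading)
        return "waiting"
    return result

if __name__ == "__main__":
    backlog = []
    print(handle_reading("hr=92", cloud_up=True, dew_up=True, pending=backlog))
    print(handle_reading("hr=95", cloud_up=False, dew_up=True, pending=backlog))
    print(handle_reading("hr=97", cloud_up=False, dew_up=False, pending=backlog))
    print("to sync with cloud later:", backlog)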
Comparative Study
Feature (from the health service provider perspective)   Cloud   Cloud-Fog-Edge   With Dew
On-premise resource utilization                           Low     Sub-optimal      Optimal