Cloud Architecture and Services Overview

The document outlines the layered architecture of cloud computing, detailing Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) along with their respective roles and challenges. It also discusses the NIST Cloud Computing Reference Architecture, identifying key actors such as cloud consumers, providers, brokers, carriers, and auditors, and their interactions. Additionally, it emphasizes the importance of Quality of Service (QoS) parameters and market-oriented resource management in optimizing cloud services.

UNIT III

CLOUD ARCHITECTURE, SERVICES AND STORAGE

Layered Cloud Architecture Design - NIST Cloud Computing Reference
Architecture - Public, Private and Hybrid Clouds - IaaS - PaaS - SaaS -
Architectural Design Challenges - Cloud Storage - Storage-as-a-Service -
Advantages of Cloud Storage - Cloud Storage Providers - S3.

3.1 LAYERED CLOUD ARCHITECTURE DESIGN


The architecture of a cloud is developed at three layers:

• Infrastructure
• Platform
• Application
• These three development layers are implemented through virtualization and
standardization of the hardware and software resources provisioned in the
cloud. The services of public, private, and hybrid clouds are delivered to
users through networking support over the Internet and the intranets involved.
• It is clear that the infrastructure layer is deployed first to support IaaS
services. This infrastructure layer serves as the foundation for building the
platform layer of the cloud for supporting PaaS services. In turn, the
platform layer is a foundation for implementing the application layer for
SaaS applications.
• Different types of cloud services demand application of these resources
separately. The infrastructure layer is built with virtualized compute,
storage, and network resources.
• The platform layer is for general-purpose and repeated usage of the
collection of software resources. This layer provides users with an
environment to develop their applications, to test operation flows, and to
monitor execution results and performance. The platform should be able
to assure users that they have scalability, dependability, and security
protection.
• In a way, the virtualized cloud platform serves as a “system middleware”
between the infrastructure and application layers of the cloud.

FIGURE: Layered architectural development of the cloud platform for IaaS, PaaS, and SaaS applications over the Internet.
• The application layer is formed with a collection of all needed software
modules for SaaS applications. Service applications in this layer include
daily office management work, such as information retrieval, document
processing, and calendar and authentication services.
• The application layer is also heavily used by enterprises in business
marketing and sales, consumer relationship management (CRM), financial
transactions, and supply chain management.
• In general, SaaS demands the most work from the provider, PaaS is in the
middle, and IaaS demands the least.
• For example, Amazon EC2 provides not only virtualized CPU resources
to users, but also management of these provisioned resources. Services at
the application layer demand more work from providers. The best example
of this is the Salesforce.com CRM service, in which the provider supplies
not only the hardware at the bottom layer and the software at the top layer,
but also the platform and software tools for user application development
and monitoring.
3.1.1 Market-Oriented Cloud Architecture
• Cloud providers consider and meet the different QoS parameters of each
individual consumer as negotiated in specific SLAs. To achieve this,
providers cannot deploy a traditional system-centric resource management
architecture. Instead, market-oriented resource management is necessary
to regulate the supply and demand of cloud resources and to achieve
market equilibrium.
• The designer needs to provide feedback on economic incentives for both
consumers and providers. The purpose is to promote QoS-based resource
allocation mechanisms. In addition, clients can benefit from the potential
cost reduction of providers, which could lead to a more competitive
market, and thus lower prices.
• Figure shows the high-level architecture for supporting market-oriented
resource allocation in a cloud computing environment.
• This cloud is basically built with the following entities: users or brokers
acting on a user’s behalf submit service requests from anywhere in the world
to the data center and cloud to be processed. The SLA resource allocator
acts as the interface between the data center/cloud service provider and
external users/brokers. It requires the interaction of the following
mechanisms to support SLA-oriented resource management. When a
service request is first submitted, the Service Request Examiner interprets
the request for QoS requirements before determining whether to accept
or reject it.
• The request examiner ensures that there is no overloading of resources
whereby many service requests cannot be fulfilled successfully due to
limited resources. It also needs the latest status information regarding
resource availability (from the VM Monitor mechanism) and workload
processing (from the Service Request Monitor mechanism) in order to
make resource allocation decisions effectively.
• Then it assigns requests to VMs and determines resource entitlements for
allocated VMs. The Pricing mechanism decides how service requests are
charged.
• For instance, requests can be charged based on submission time (peak/off-
peak), pricing rates (fixed/variable), or availability of resources (supply/
demand). Pricing serves as a basis for managing the supply and demand of
computing resources within the data center and facilitates the
prioritization of resource allocations.

FIGURE: Market-oriented cloud architecture to expand/shrink leasing of resources with variation in QoS/demand from users.
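The pricing behavior described above can be sketched in a few lines. This is an illustrative example only, not an implementation from the text: the peak window, the rates, and the linear demand multiplier are all assumed values.

```python
# Hedged sketch of a Pricing mechanism: charge by submission time
# (peak/off-peak) and by current resource supply/demand. All rates and
# the surge formula are invented for illustration.

PEAK_HOURS = range(9, 18)                      # assumed peak window: 09:00-17:59
BASE_RATE = {"peak": 0.12, "off_peak": 0.05}   # assumed $ per CPU-hour

def price(cpu_hours: float, submit_hour: int, utilization: float) -> float:
    """Charge for a request given its size, submission time, and data-center load.

    utilization is the fraction of resources in use (0.0-1.0); higher demand
    raises the price, which helps regulate supply and demand.
    """
    band = "peak" if submit_hour in PEAK_HOURS else "off_peak"
    demand_multiplier = 1.0 + utilization      # simple linear surge factor
    return round(cpu_hours * BASE_RATE[band] * demand_multiplier, 4)

# A 10 CPU-hour request submitted at 14:00 under 50% utilization costs more
# than the same request at 02:00 under 10% utilization.
print(price(10, 14, 0.5))
print(price(10, 2, 0.1))
```

The point of the sketch is that price is a control signal: raising it when utilization is high steers consumers toward off-peak submission, which is how pricing "manages the supply and demand of computing resources."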
• The Accounting mechanism maintains the actual usage of resources by
requests so that the final cost can be computed and charged to users. In
addition, the maintained historical usage information can be utilized by the
Service Request Examiner and Admission Control mechanism to improve
resource allocation decisions.
• The VM Monitor mechanism keeps track of the availability of VMs and
their resource entitlements. The Dispatcher mechanism starts the
execution of accepted service requests on allocated VMs. The Service
Request Monitor mechanism keeps track of the execution progress of
service requests.
• Multiple VMs can be started and stopped on demand on a single physical
machine to meet accepted service requests, hence providing maximum
flexibility to configure various partitions of resources on the same physical
machine to different specific requirements of service requests.
• In addition, multiple VMs can concurrently run applications based on
different operating system environments on a single physical machine since
the VMs are isolated from one another on the same physical machine.
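The interaction of the mechanisms above (Service Request Examiner and Admission Control, VM Monitor, Dispatcher, and Accounting) can be sketched as a minimal pipeline. All class and field names here are illustrative assumptions, not part of any real allocator.

```python
# Hedged sketch of the SLA resource allocator described above: the examiner
# admits a request only if some VM has enough free capacity (VM Monitor
# state), the dispatcher starts it on that VM, and accounting records usage.

from dataclasses import dataclass, field

@dataclass
class VM:
    vm_id: str
    capacity: int          # CPU cores entitled to this VM
    in_use: int = 0

    def free(self) -> int:
        return self.capacity - self.in_use

@dataclass
class Allocator:
    vms: list                                    # VM Monitor: latest availability
    ledger: dict = field(default_factory=dict)   # Accounting: usage per request

    def submit(self, request_id: str, cores: int) -> bool:
        """Examiner + admission control: reject if no VM can host the request."""
        vm = next((v for v in self.vms if v.free() >= cores), None)
        if vm is None:
            return False                         # avoid overloading limited resources
        vm.in_use += cores                       # Dispatcher: start on the allocated VM
        self.ledger[request_id] = (vm.vm_id, cores)  # Accounting record
        return True

alloc = Allocator(vms=[VM("vm-1", 4), VM("vm-2", 2)])
print(alloc.submit("req-1", 3))   # accepted on vm-1
print(alloc.submit("req-2", 2))   # accepted on vm-2
print(alloc.submit("req-3", 2))   # rejected: no VM has 2 free cores
```

Rejecting `req-3` up front is the admission-control behavior the text describes: it is better to refuse a request than to accept one that cannot be fulfilled with the available resources.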
3.1.2 Quality of Service Factors
• The data center comprises multiple computing servers that provide
resources to meet service demands. In the case of a cloud as a commercial
offering to enable crucial business operations of companies, there are
critical QoS parameters to consider in a service request, such as time, cost,
reliability, and trust/security.
• In short, there should be greater emphasis on customers, since they pay
to access services in clouds. In addition, the state of the art in cloud
computing has no or limited support for dynamic negotiation of SLAs
between participants and for mechanisms that automatically allocate
resources to multiple competing requests. Negotiation mechanisms that can
respond to alternate offers are needed for establishing SLAs.
• Commercial cloud offerings must be able to support customer-driven
service management based on customer profiles and requested service
requirements. Commercial clouds define computational risk management
tactics to identify, assess, and manage risks involved in the execution of
applications with regard to service requirements and customer needs.
• The cloud also derives appropriate market-based resource management
strategies that encompass both customer-driven service management and
computational risk management to sustain SLA-oriented resource
allocation.
• The system incorporates autonomic resource management models that
effectively self-manage changes in service requirements to satisfy both new
service demands and existing service obligations, and leverage VM
technology to dynamically assign resource shares according to service
requirements.
3.2 NIST CLOUD COMPUTING REFERENCE ARCHITECTURE
3.2.1 The Conceptual Reference Model
The National Institute of Standards and Technology (NIST) released its
first guidelines for agencies that want to use cloud computing in the
second half of 2009.
Figure presents an overview of the NIST cloud computing reference
architecture, which identifies the major actors, their activities, and their
functions in cloud computing.
The diagram depicts a generic high-level architecture and is intended to
facilitate the understanding of the requirements, uses, characteristics, and
standards of cloud computing.

Figure: The Conceptual Reference Model

As shown in Figure, the NIST cloud computing reference architecture


defines five major actors: cloud consumer, cloud provider, cloud carrier,
cloud auditor and cloud broker. Each actor is an entity (a person or an
organization) that participates in a transaction or process and/or performs
tasks in cloud computing.
Table briefly lists the actors defined in the NIST cloud computing
reference architecture.
Actors in Cloud Computing

Cloud Consumer: A person or organization that maintains a business
relationship with, and uses services from, Cloud Providers.
Cloud Provider: A person, organization, or entity responsible for making a
service available to interested parties.
Cloud Auditor: A party that can conduct independent assessment of cloud
services, information system operations, performance, and security of the
cloud implementation.
Cloud Broker: An entity that manages the use, performance, and delivery of
cloud services, and negotiates relationships between Cloud Providers and
Cloud Consumers.
Cloud Carrier: An intermediary that provides connectivity and transport of
cloud services from Cloud Providers to Cloud Consumers.
Figure illustrates the interactions among the actors. A cloud consumer may
request cloud services from a cloud provider directly or via a cloud broker.
A cloud auditor conducts independent audits and may contact the others
to collect necessary information.

Figure: Interactions between the Actors in Cloud Computing. The figure shows the communication path between a cloud provider and a cloud consumer, the communication paths a cloud auditor uses to collect auditing information, and the paths a cloud broker uses to provide service to a cloud consumer.


Example Usage Scenario 1:
A cloud consumer may request service from a cloud broker instead of
contacting a cloud provider directly. The cloud broker may create a new
service by combining multiple services or by enhancing an existing service.
In this example, the actual cloud providers are invisible to the cloud
consumer and the cloud consumer interacts directly with the cloud broker.
Figure: Usage Scenario for Cloud Brokers
Example Usage Scenario 2:
Cloud carriers provide the connectivity and transport of cloud services
from cloud providers to cloud consumers. As illustrated in Figure 3.6, a
cloud provider participates in and arranges for two unique service level
agreements (SLAs): one with a cloud carrier (e.g., SLA2) and one with a
cloud consumer (e.g., SLA1). The provider may request dedicated and
encrypted connections from the carrier to ensure the cloud services are
consumed at a consistent level, in accordance with its contractual
obligations to the cloud consumers. In this case, the provider may specify
its requirements on capability, flexibility, and functionality in SLA2 in
order to meet the essential requirements of SLA1.

Figure : Usage Scenario for Cloud Carriers
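The SLA chain in this scenario can be sketched as a simple consistency check: before committing to SLA1, the provider verifies that SLA2 can carry it. The specific fields (bandwidth, encryption) are assumed examples, not terms from an actual contract.

```python
# Hedged sketch of Usage Scenario 2: a provider checks that its carrier
# contract (SLA2) is strong enough to back the promise it makes to a
# consumer (SLA1). Field names are illustrative assumptions.

def sla2_supports_sla1(sla1: dict, sla2: dict) -> bool:
    """SLA2 must offer at least the bandwidth SLA1 promises, and must be
    encrypted whenever SLA1 requires encrypted delivery."""
    return (sla2["bandwidth_mbps"] >= sla1["bandwidth_mbps"]
            and (not sla1["encrypted"] or sla2["encrypted"]))

sla1 = {"bandwidth_mbps": 100, "encrypted": True}   # consumer-facing promise
sla2 = {"bandwidth_mbps": 200, "encrypted": True}   # carrier contract
print(sla2_supports_sla1(sla1, sla2))   # the carrier can back the promise
```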


Example Usage Scenario 3:
For a cloud service, a cloud auditor conducts independent assessments of
the operation and security of the cloud service implementation. The audit
may involve interactions with both the Cloud Consumer and the Cloud
Provider.
Figure: Usage Scenario for Cloud Auditors
3.2.2 Cloud Consumer
The cloud consumer is the principal stakeholder for the cloud computing
service. A cloud consumer represents a person or organization that
maintains a business relationship with, and uses the service from a cloud
provider.
A cloud consumer browses the service catalog from a cloud provider,
requests the appropriate service, sets up service contracts with the cloud
provider, and uses the service. The cloud consumer may be billed for the
service provisioned, and needs to arrange payments accordingly.
Cloud consumers need SLAs to specify the technical performance
requirements fulfilled by a cloud provider. SLAs can cover terms regarding
the quality of service, security, remedies for performance failures. A cloud
provider may also list in the SLAs a set of promises explicitly not made to
consumers, i.e. limitations, and obligations that cloud consumers must
accept.
A cloud consumer can freely choose a cloud provider with better pricing
and more favorable terms. Typically, a cloud provider’s pricing policy and
SLAs are non-negotiable, unless the customer expects heavy usage and
may be able to negotiate better contracts.
Depending on the services requested, the activities and usage scenarios can
be different among cloud consumers.
Figure presents some example cloud services available to a cloud
consumer. SaaS applications run in the cloud and are made accessible via
a network to the SaaS consumers.
The consumers of SaaS can be organizations that provide their members
with access to software applications, end users who directly use software
applications, or software application administrators who configure
applications for end users. SaaS consumers can be billed based on the
number of end users, the time of use, the network bandwidth consumed, the
amount of data stored or duration of stored data.
Figure: Example Services Available to a Cloud Consumer
Cloud consumers of PaaS can employ the tools and execution resources
provided by cloud providers to develop, test, deploy and manage the
applications hosted in a cloud environment.
PaaS consumers can be application developers who design and implement
application software, application testers who run and test applications in
cloud-based environments, application deployers who publish applications
into the cloud, and application administrators who configure and monitor
application performance on a platform.
PaaS consumers can be billed according to the processing, database storage,
and network resources consumed by the PaaS application, and the duration
of platform usage.
Consumers of IaaS have access to virtual computers, network-accessible
storage, network infrastructure components, and other fundamental
computing resources on which they can deploy and run arbitrary software.
The consumers of IaaS can be system developers, system administrators
and IT managers who are interested in creating, installing, managing and
monitoring services for IT infrastructure operations. IaaS consumers are
provisioned with the capabilities to access these computing resources, and
are billed according to the amount or duration of the resources consumed,
such as CPU hours used by virtual computers, volume and duration of data
stored, network bandwidth consumed, number of IP addresses used for
certain intervals.
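The IaaS billing dimensions listed above (CPU hours, volume and duration of stored data, bandwidth, and IP address reservations) can be combined in a simple metering formula. All unit prices below are invented for illustration; real providers publish their own rate cards.

```python
# Hedged sketch of IaaS usage-based billing over the dimensions named in
# the text. The rates are made-up example values.

RATES = {
    "cpu_hour": 0.08,         # $ per CPU hour used by virtual computers
    "gb_month_stored": 0.02,  # $ per GB stored per month (volume x duration)
    "gb_transferred": 0.09,   # $ per GB of network bandwidth consumed
    "ip_hour": 0.005,         # $ per reserved IP address per hour
}

def iaas_bill(cpu_hours, gb_months, gb_out, ip_hours) -> float:
    return round(cpu_hours * RATES["cpu_hour"]
                 + gb_months * RATES["gb_month_stored"]
                 + gb_out * RATES["gb_transferred"]
                 + ip_hours * RATES["ip_hour"], 2)

# 100 CPU hours, 50 GB stored for one month, 20 GB egress, one IP for 720 hours
print(iaas_bill(100, 50, 20, 720))
```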
3.2.3 Cloud Provider
A cloud provider is a person, organization, or entity responsible for
making a service available to interested parties.
A Cloud Provider acquires and manages the computing infrastructure
required for providing the services, runs the cloud software that provides
the services, and makes arrangements to deliver the cloud services to
Cloud Consumers through network access.
For Software as a Service, the cloud provider deploys, configures,
maintains and updates the operation of the software applications on a
cloud infrastructure so that the services are provisioned at the expected
service levels to cloud consumers.
The provider of SaaS assumes most of the responsibilities in managing and
controlling the applications and the infrastructure, while the cloud
consumers have limited administrative control of the applications.
For PaaS, the Cloud Provider manages the computing infrastructure for
the platform and runs the cloud software that provides the components of
the platform, such as runtime software execution stack, databases, and
other middleware components.
The PaaS Cloud Provider typically also supports the development,
deployment and management process of the PaaS Cloud Consumer by
providing tools such as integrated development environments (IDEs),
development version of cloud software, software development kits
(SDKs), deployment and management tools.
The PaaS Cloud Consumer has control over the applications and possibly
some of the hosting environment settings, but has no or limited access to
the infrastructure underlying the platform, such as the network, servers,
operating systems (OS), or storage.
For IaaS, the Cloud Provider acquires the physical computing resources
underlying the service, including the servers, networks, storage and
hosting infrastructure.
The Cloud Provider runs the cloud software necessary to make computing
resources available to the IaaS Cloud Consumer through a set of service
interfaces and computing resource abstractions, such as virtual machines
and virtual network interfaces.
The IaaS Cloud Consumer in turn uses these computing resources, such as
a virtual computer, for their fundamental computing needs. Compared to
SaaS and PaaS Cloud Consumers, an IaaS Cloud Consumer has access to
more fundamental forms of computing resources and thus has more
control over more of the software components in the application stack,
including the OS and network.
The IaaS Cloud Provider, on the other hand, has control over the physical
hardware and cloud software that make the provisioning of these
infrastructure services possible: for example, the physical servers, network
equipment, storage devices, host OS, and hypervisors for virtualization.
A Cloud Provider’s activities can be described in five major areas. As shown
in Figure 3.9, a cloud provider conducts its activities in the areas of service
deployment, service orchestration, cloud service management, security,
and privacy.

Figure: Cloud Provider - Major Activities


3.2.4 Cloud Auditor
A cloud auditor is a party that can perform an independent examination of
cloud service controls with the intent to express an opinion thereon.
Audits are performed to verify conformance to standards through review
of objective evidence. A cloud auditor can evaluate the services provided
by a cloud provider in terms of security controls, privacy impact,
performance, etc.
Auditing is especially important for federal agencies as “agencies should
include a contractual clause enabling third parties to assess security
controls of cloud providers” (by Vivek Kundra, Federal Cloud Computing
Strategy, Feb. 2011.).
Security controls are the management, operational, and technical
safeguards or countermeasures employed within an organizational
information system to protect the confidentiality, integrity, and availability
of the system and its information.
For security auditing, a cloud auditor can make an assessment of the
security controls in the information system to determine the extent to
which the controls are implemented correctly, operating as intended, and
producing the desired outcome with respect to the security requirements for the
system.
The security auditing should also include the verification of the compliance
with regulation and security policy. For example, an auditor can be tasked
with ensuring that the correct policies are applied to data retention
according to relevant rules for the jurisdiction. The auditor may ensure that
fixed content has not been modified and that the legal and business data
archival requirements have been satisfied.
A privacy impact audit can help Federal agencies comply with applicable
privacy laws and regulations governing an individual’s privacy, and ensure
confidentiality, integrity, and availability of an individual’s personal
information at every stage of development and operation.
3.2.5 Cloud Broker
A cloud consumer may request cloud services from a cloud broker, instead
of contacting a cloud provider directly. A cloud broker is an entity that
manages the use, performance and delivery of cloud services and
negotiates relationships between cloud providers and cloud consumers.
In general, a cloud broker can provide services in three categories:
Service Intermediation: A cloud broker enhances a given service by
improving some specific capability and providing value-added services to
cloud consumers. The improvement can be managing access to cloud
services, identity management, performance reporting, enhanced
security, etc.
Service Aggregation: A cloud broker combines and integrates multiple
services into one or more new services. The broker provides data
integration and ensures the secure data movement between the cloud
consumer and multiple cloud providers.
Service Arbitrage: Service arbitrage is similar to service aggregation
except that the services being aggregated are not fixed. Service arbitrage
means a broker has the flexibility to choose services from multiple
agencies. The cloud broker, for example, can use a credit-scoring service to
measure and select an agency with the best score.
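Service arbitrage, as described above, amounts to scoring candidate providers and picking the best at request time. The sketch below is illustrative only: the provider data and scoring weights are invented, standing in for something like the credit-scoring service the text mentions.

```python
# Hedged sketch of service arbitrage: a broker scores each candidate
# provider and selects the one with the best score. Data and weights
# are assumed examples.

providers = [
    {"name": "provider-a", "price": 0.10, "uptime": 0.999},
    {"name": "provider-b", "price": 0.07, "uptime": 0.990},
    {"name": "provider-c", "price": 0.12, "uptime": 0.9999},
]

def score(p: dict) -> float:
    # Higher uptime is better, lower price is better (weights are assumed).
    return 100 * p["uptime"] - 50 * p["price"]

best = max(providers, key=score)
print(best["name"])
```

Unlike service aggregation, nothing here is fixed: the broker can re-run the selection for every request as prices and scores change.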
3.2.6 Cloud Carrier
A cloud carrier acts as an intermediary that provides connectivity and
transport of cloud services between cloud consumers and cloud providers.
Cloud carriers provide access to consumers through network,
telecommunication and other access devices.
For example, cloud consumers can obtain cloud services through network
access devices, such as computers, laptops, mobile phones, mobile Internet
devices (MIDs), etc.
The distribution of cloud services is normally provided by network and
telecommunication carriers or a transport agent, where a transport agent
refers to a business organization that provides physical transport of storage
media such as high-capacity hard drives.
Note that a cloud provider will set up SLAs with a cloud carrier to provide
services consistent with the level of SLAs offered to cloud consumers, and
may require the cloud carrier to provide dedicated and secure connections
between cloud consumers and cloud providers.

3.2.7 Scope of Control between Provider and Consumer


The Cloud Provider and Cloud Consumer share the control of resources in
a cloud system. As illustrated in Figure, different service models affect an
organization’s control over the computational resources and thus what can
be done in a cloud system.
The figure shows these differences using a classic software stack notation
comprised of the application, middleware, and OS layers.
This analysis of delineation of controls over the application stack helps
understand the responsibilities of parties involved in managing the cloud
application.

Figure: Scope of Controls between Provider and Consumer
The application layer includes software applications targeted at end users
or programs. The applications are used by SaaS consumers, or installed/
managed/ maintained by PaaS consumers, IaaS consumers, and SaaS
providers.
The middleware layer provides software building blocks (e.g., libraries,
database, and Java virtual machine) for developing application software in
the cloud. The middleware is used by PaaS consumers, installed/managed/
maintained by IaaS consumers or PaaS providers, and hidden from SaaS
consumers.
The OS layer includes operating system and drivers, and is hidden from
SaaS consumers and PaaS consumers.
An IaaS cloud allows one or multiple guest OSes to run virtualized on a
single physical host.
Generally, consumers have broad freedom to choose which OS to host
among all the OSes the cloud provider can support.
The IaaS consumers should assume full responsibility for the guest OSes,
while the IaaS provider controls the host OS.
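The scope-of-control delineation above can be summarized as a lookup table. This table paraphrases the text's description of the application, middleware, and OS layers; it is not an official NIST artifact.

```python
# Hedged summary of who manages each layer of the application stack under
# each service model, per the scope-of-control discussion above.

CONTROL = {
    "SaaS": {"application": "provider", "middleware": "provider", "os": "provider"},
    "PaaS": {"application": "consumer", "middleware": "provider", "os": "provider"},
    "IaaS": {"application": "consumer", "middleware": "consumer",
             "os": "consumer (guest OS); provider controls host OS"},
}

def who_manages(model: str, layer: str) -> str:
    return CONTROL[model][layer]

print(who_manages("PaaS", "application"))   # consumer
print(who_manages("SaaS", "os"))            # provider
```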

3.3 CLOUD COMPUTING REFERENCE ARCHITECTURE: ARCHITECTURAL COMPONENTS
3.3.1 Service Deployment
• A cloud infrastructure may be operated in one of the following deployment
models: public cloud, private cloud, community cloud, or hybrid cloud.
The differences are based on how exclusive the computing resources are
made to a Cloud Consumer.
Public Cloud
• A public cloud is built over the Internet and can be accessed by any user
who has paid for the service. Public clouds are owned by service providers
and are accessible through a subscription.
• The callout box at the top of the figure shows the architecture of a typical
public cloud.
FIGURE: Public, private, and hybrid clouds illustrated by functional architecture and connectivity of representative clouds available by 2011.
• Many public clouds are available, including Google App Engine (GAE),
Amazon Web Services (AWS), Microsoft Azure, IBM Blue Cloud, and
Salesforce.com’s Force.com. The providers of the aforementioned clouds
are commercial providers that offer a publicly accessible remote interface for
creating and managing VM instances within their proprietary infrastructure.

• A public cloud delivers a selected set of business processes.


• The application and infrastructure services are offered on a flexible
price-per-use basis.
• A public cloud is one in which the cloud infrastructure and computing
resources are made available to the general public over a public network.
The NIST definition of cloud computing:
“Cloud computing is a model for enabling ubiquitous, convenient, on-demand
network access to a shared pool of configurable computing resources (e.g.,
networks, servers, storage, applications, and services) that can be rapidly
provisioned and released with minimal management effort or service provider
interaction.”
• The NIST definition also identifies
o 5 essential characteristics
o 3 service models
o 4 deployment models
4 deployment models / Types of cloud
• Public: Accessible, via the Internet, to anyone who pays
– Owned by service providers; e.g., Google App Engine, Amazon Web
Services, Force.com.
A public cloud is a publicly accessible cloud environment owned by a third-
party cloud provider. The IT resources on public clouds are usually
provisioned via the previously described cloud delivery models and are
generally offered to cloud consumers at a cost or are commercialized via
other avenues (such as advertisement).
The cloud provider is responsible for the creation and on-going maintenance
of the public cloud and its IT resources. Many of the scenarios and
architectures explored in upcoming chapters involve public clouds and the
relationship between the providers and consumers of IT resources via public
clouds.
Figure 1 shows a partial view of the public cloud landscape, highlighting
some of the primary vendors in the marketplace.
• Community: Shared by two or more organizations with joint interests,
such as colleges within a university
A community cloud is similar to a public cloud except that its access is
limited to a specific community of cloud consumers. The community cloud
may be jointly owned by the community members or by a third-party cloud
provider that provisions a public cloud with limited access. The member
cloud consumers of the community typically share the responsibility for
defining and evolving the community cloud (Figure 1).
Membership in the community does not necessarily guarantee access to or
control of all the cloud's IT resources. Parties outside the community are
generally not granted access unless allowed by the community.
An example of a "community" of organizations accessing IT resources from
a community cloud.
• Private: Accessible via an intranet to the members of the owning
organization
– Can be built using open source software such as CloudStack or OpenStack
– Example of private cloud: NASA’s cloud for climate modeling
A private cloud is owned by a single organization. Private clouds enable an
organization to use cloud computing technology as a means of centralizing
access to IT resources by different parts, locations, or departments of the
organization. When a private cloud exists as a controlled environment, the
problems described in the Risks and Challenges section do not tend to apply.
The use of a private cloud can change how organizational and trust
boundaries are defined and applied. The actual administration of a private
cloud environment may be carried out by internal or outsourced staff.

A cloud service consumer in the organization's on-premise environment
accesses a cloud service hosted on the same organization's private cloud
via a virtual private network.
With a private cloud, the same organization is technically both the cloud
consumer and the cloud provider. In order to differentiate these roles:
▪ a separate organizational department typically assumes the responsibility
for provisioning the cloud (and therefore assumes the cloud provider role)
▪ departments requiring access to the private cloud assume the cloud
consumer role

Hybrid
A hybrid cloud is a cloud environment comprised of two or more different
cloud deployment models. For example, a cloud consumer may choose to
deploy cloud services processing sensitive data to a private cloud and other,
less sensitive cloud services to a public cloud. The result of this combination
is a hybrid deployment model.

An organization using a hybrid cloud architecture utilizes both a private
and a public cloud.
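The hybrid deployment decision described above (sensitive data to the private cloud, everything else to the public cloud) can be sketched as a one-line placement policy. The `sensitive` flag and the service list are assumed examples.

```python
# Hedged sketch of a hybrid-cloud placement policy: services handling
# sensitive data go to the private cloud, the rest to the public cloud.

def place(service: dict) -> str:
    return "private" if service["sensitive"] else "public"

services = [
    {"name": "payroll", "sensitive": True},
    {"name": "marketing-site", "sensitive": False},
]

placement = {s["name"]: place(s) for s in services}
print(placement)
```

In practice the policy would weigh more than one flag (regulatory jurisdiction, latency, cost), which is part of why the text notes that hybrid architectures are complex to create and maintain.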
Hybrid deployment architectures can be complex and challenging to create
and maintain due to the potential disparity in cloud environments and the
fact that management responsibilities are typically split between the private
cloud provider organization and the public cloud provider.

Three service models / Categories of cloud computing.


• Software as a Service (SaaS)
– Use the provider's applications over a network
• Platform as a Service (PaaS)
– Deploy customer-created applications to a cloud
• Infrastructure as a Service (IaaS)
– Provision processing, storage, and networks to run arbitrary software

• Software as a Service (SaaS). The capability provided to the
consumer is to use the provider’s applications running on a cloud
infrastructure. The applications are accessible from various client
devices through a thin client interface such as a web browser (e.g.,
web-based email). The consumer does not manage or control the
underlying cloud infrastructure including network, servers,
operating systems, storage, or even individual application
capabilities, with the possible exception of provider-defined user-
specific application configuration settings.
• Platform as a Service (PaaS). The capability provided to the
consumer is to deploy onto the cloud infrastructure consumer-
created or acquired applications created using programming
languages and tools supported by the provider. The consumer does
not manage or control the underlying cloud infrastructure including
network, servers, operating systems, or storage, but has control over
the deployed applications and possibly application hosting
environment configurations.
• Infrastructure as a Service (IaaS). The capability provided to the
consumer is to provision processing, storage, networks, and other
fundamental computing resources where the consumer is able to
deploy and run arbitrary software, which can include operating
systems and applications. The consumer does not manage or control
the underlying cloud physical infrastructure but has control over
operating systems, storage, deployed applications, and possibly
limited control of select networking components.
Comparison of cloud service models
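The division of management responsibility implied by the three definitions above can be sketched as a small lookup table. This is an illustrative paraphrase of the NIST wording, not an official taxonomy; the layer names are chosen for this sketch:

```python
# Illustrative only: who manages each layer under IaaS, PaaS, and SaaS,
# paraphrasing the NIST definitions above. Layer names are invented here.

LAYERS = ["application", "runtime", "operating system",
          "storage", "servers", "network"]

# True means the cloud provider manages the layer; False, the consumer.
MANAGED_BY_PROVIDER = {
    "IaaS": {"application": False, "runtime": False, "operating system": False,
             "storage": False, "servers": True, "network": True},
    "PaaS": {"application": False, "runtime": True, "operating system": True,
             "storage": True, "servers": True, "network": True},
    "SaaS": {layer: True for layer in LAYERS},
}

def consumer_managed(model):
    """Return the layers the consumer still controls under a given model."""
    return [l for l in LAYERS if not MANAGED_BY_PROVIDER[model][l]]

print(consumer_managed("IaaS"))  # consumer keeps apps, runtime, OS, storage
print(consumer_managed("SaaS"))  # provider manages everything
```

Under SaaS the consumer-managed list is empty, matching the definition above: the consumer controls at most provider-defined configuration settings.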

3.3.2 Service Orchestration


• Service Orchestration refers to the composition of system components to
support the Cloud Provider's activities in the arrangement, coordination, and
management of computing resources in order to provide cloud services to
Cloud Consumers.
3.4 INFRASTRUCTURE-AS-A-SERVICE (IAAS)
• Cloud computing delivers infrastructure, platform, and software
(application) as services, which are made available as subscription-based
services in a pay-as-you-go model to consumers.
• The services provided over the cloud can be generally categorized into
three different models: namely IaaS, Platform as a Service (PaaS), and
Software as a Service (SaaS). All three models allow users to access
services over the Internet, relying entirely on the infrastructures of cloud
service providers.
• These models are offered based on various SLAs between providers and
users. In a broad sense, the SLA for cloud computing is addressed in terms
of service availability, performance, and data protection and security.
• Figure illustrates three cloud models at different service levels of the cloud.
SaaS is applied at the application end using special interfaces by users or
clients.

FIGURE: The IaaS, PaaS, and SaaS cloud service models at different
service levels.
• At the PaaS layer, the cloud platform must perform billing services and
handle job queuing, launching, and monitoring services. At the bottom
layer of the IaaS services, databases, compute instances, the file system,
and storage must be provisioned to satisfy user demands.
3.4.1 Infrastructure as a Service
• This model allows users to use virtualized IT resources for computing,
storage, and networking. In short, the service is performed by rented cloud
infrastructure. The user can deploy and run his applications over his chosen
OS environment.
• The user does not manage or control the underlying cloud infrastructure,
but has control over the OS, storage, deployed applications, and possibly
select networking components. This IaaS model encompasses storage as
a service, compute instances as a service, and communication as a
service.
• The Virtual Private Cloud (VPC) in Example shows how to provide Amazon
EC2 clusters and S3 storage to multiple users. Many startup cloud providers
have appeared in recent years. GoGrid, FlexiScale, and Aneka are good
examples. Table summarizes the IaaS offerings by five public cloud providers.
Interested readers can visit the companies’ web sites for updated
information.
Public Cloud Offerings of IaaS

| Cloud Name | VM Instance Capacity | API and Access Tools | Hypervisor, Guest OS |
| Amazon EC2 | Each instance has 1–20 processors, 1.7–15 GB of memory, and 160 GB–1.69 TB of storage. | EC2 CLI or Web Service (WS) portal | Xen; Linux, Windows |
| GoGrid | Each instance has 1–6 CPUs, 0.5–8 GB of memory, and 30–480 GB of storage. | REST, Java, PHP, Python, Ruby | Xen; Linux, Windows |
| Rackspace Cloud | Each instance has a four-core CPU, 0.25–16 GB of memory, and 10–620 GB of storage. | REST, Python, PHP, Java, C#, .NET | Xen; Linux |
| FlexiScale in the UK | Each instance has 1–4 CPUs, 0.5–16 GB of memory, and 20–270 GB of storage. | Web console | Xen; Linux, Windows |
| Joyent Cloud | Each instance has up to eight CPUs, 0.25–32 GB of memory, and 30–480 GB of storage. | No specific API; SSH, VirtualMin | OS-level virtualization; OpenSolaris |
Example
Amazon VPC for Multiple Tenants
• A user can use a private facility for basic computations. When he must meet
a specific workload requirement, he can use the Amazon VPC to provide
additional EC2 instances or more storage (S3) to handle urgent
applications.
• Figure shows VPC which is essentially a private cloud designed to address
the privacy concerns of public clouds that hamper their application when
sensitive data and software are involved.

FIGURE Amazon VPC (virtual private cloud). Courtesy of VMWare,
https://aws.amazon.com/vpc/
• Amazon EC2 provides the following services: resources from multiple
globally distributed data centers; a CLI and web services (SOAP and Query);
web-based console user interfaces; access to VM instances via SSH and
Windows; 99.5 percent availability agreements; per-hour pricing; Linux and
Windows OSes; and automatic scaling and load balancing.
• VPC allows the user to isolate provisioned AWS processors, memory, and
storage from interference by other users. Both autoscaling and elastic
load balancing services can support related demands. Autoscaling enables
users to automatically scale their VM instance capacity up or down. With
autoscaling, one can ensure that a sufficient number of Amazon EC2
instances are provisioned to meet desired performance, or one can scale
down the VM instance capacity to reduce costs when the workload is reduced.
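The scale-up/scale-down behavior described above can be sketched as a simple threshold rule. This is a minimal illustration of the idea, not EC2's actual policy engine; the thresholds, limits, and load figures are hypothetical:

```python
# Minimal sketch of threshold-based autoscaling of the kind EC2's Auto
# Scaling performs. All numbers here are hypothetical illustrations.

def autoscale(instances, load_per_instance_pct,
              scale_up_at=80, scale_down_at=30,
              min_instances=1, max_instances=20):
    """Return the new instance count after one scaling decision."""
    if load_per_instance_pct > scale_up_at and instances < max_instances:
        return instances + 1          # provision one more VM
    if load_per_instance_pct < scale_down_at and instances > min_instances:
        return instances - 1          # release an idle VM to cut cost
    return instances                  # within the target band: no change

print(autoscale(4, 92))   # load high -> 5
print(autoscale(4, 12))   # load low  -> 3
print(autoscale(4, 55))   # steady    -> 4
```

The min/max bounds mirror the point in the text: enough instances to meet desired performance, but capacity released when the workload drops.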
3.5 PLATFORM-AS-A-SERVICE (PAAS) AND SOFTWARE-AS- A-
SERVICE (SAAS)
SaaS is often built on top of the PaaS, which is in turn built on top of the
IaaS.
3.5.1 Platform as a Service (PaaS)
• To be able to develop, deploy, and manage the execution of applications
using provisioned resources demands a cloud platform with the proper
software environment. Such a platform includes operating system and
runtime library support.
• This has triggered the creation of the PaaS model to enable users to
develop and deploy their user applications. Table highlights cloud
platform services offered by five PaaS services.

Five Public Cloud Offerings of PaaS

| Cloud Name | Languages and Developer Tools | Programming Models Supported by Provider | Target Applications and Storage Option |
| Google App Engine | Python, Java, and Eclipse-based IDE | MapReduce, web programming on demand | Web applications and BigTable storage |
| Salesforce.com's Force.com | Apex, Eclipse-based IDE, web-based Wizard | Workflow, Excel-like formula, web programming on demand | Business applications such as CRM |
| Microsoft Azure | .NET, Azure tools for MS Visual Studio | Unrestricted model | Enterprise and web applications |
| Amazon Elastic MapReduce | Hive, Pig, Cascading, Java, Ruby, Perl, Python, PHP, R, C++ | MapReduce | Data processing and e-commerce |
| Aneka | .NET, stand-alone SDK | Threads, task, MapReduce | .NET enterprise applications, HPC |
• The platform cloud is an integrated computer system consisting of both
hardware and software infrastructure. The user application can be
developed on this virtualized cloud platform using some programming
languages and software tools supported by the provider (e.g., Java,
Python, .NET).
• The user does not manage the underlying cloud infrastructure. The cloud
provider supports user application development and testing on a well-
defined service platform. This PaaS model enables a collaborated
software development platform for users from different parts of the world.
This model also encourages third parties to provide software management,
integration, and service monitoring solutions.
Example
Google App Engine for PaaS Applications
As web applications are running on Google’s server clusters, they share
the same capability with many other users. The applications have features
such as automatic scaling and load balancing which are very convenient
while building web applications. The distributed scheduler mechanism can
also schedule tasks for triggering events at specified times and regular
intervals.
Figure shows the operational model for GAE. To develop applications using
GAE, a development environment must be provided.

Users

HTTP HTTP
request response

User
interface

Google load balance

Data Data Data Data


Data
Data Data Data

FIGURE Google App Engine platform for PaaS operations


Google provides a fully featured local development environment that
simulates GAE on the developer’s computer. All the functions and
application logic can be implemented locally which is quite similar to
traditional software development. The coding and debugging stages can be
performed locally as well. After these steps are finished, the provided SDK
includes a tool for uploading the user's application to Google's
infrastructure where the applications are actually deployed. Many additional
third-party capabilities, including software management, integration, and
service monitoring solutions, are also provided.
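The develop-locally-then-upload workflow described above can be mimicked with a plain WSGI handler, the interface that GAE's Python runtime deploys. The handler and greeting below are invented for illustration, and the request is simulated locally (as the GAE development server would) rather than actually served:

```python
# A minimal WSGI application of the kind GAE's Python runtime deploys.
# Runs locally with the standard library alone; the message is invented.
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    """Respond to any request with a plain-text greeting."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from a GAE-style handler"]

# Simulate one request locally, as a local development server would.
environ = {}
setup_testing_defaults(environ)
status_holder = {}
def start_response(status, headers):
    status_holder["status"] = status
body = b"".join(app(environ, start_response))
print(status_holder["status"], body.decode())
```

Because the handler is plain WSGI, the coding and debugging stages happen entirely on the developer's machine, matching the workflow in the text.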
3.5.2 Software as a Service (SaaS)
• This refers to browser-initiated application software delivered over the
Internet to thousands of cloud customers. Services and tools offered by
PaaS are utilized in the construction of applications and the management
of their deployment on resources offered by IaaS providers. The SaaS model
provides software applications as a service.
• As a result, on the customer side, there is no upfront investment in servers
or software licensing. On the provider side, costs are kept rather low,
compared with conventional hosting of user applications.
• Customer data is stored in the cloud that is either vendor proprietary or
publicly hosted to support PaaS and IaaS. The best examples of SaaS
services include Google Gmail and docs, Microsoft SharePoint, and the
CRM software from Salesforce.com. They are all very successful in
promoting their own business or are used by thousands of small businesses
in their day-to-day operations.
• Providers such as Google and Microsoft offer integrated IaaS and PaaS
services, whereas others such as Amazon and GoGrid offer pure IaaS
services and expect third-party PaaS providers such as Manjrasoft to offer
application development and deployment services on top of their
infrastructure services. To identify important cloud applications in
enterprises, the success stories of three real-life cloud applications are
presented in Example 3.6 for HTC, news media, and business transactions.
The benefits of using cloud services are evident in these SaaS applications.
Example
Three Success Stories on SaaS Applications
1. To discover new drugs through DNA sequence analysis, Eli Lilly has
used Amazon’s AWS platform with provisioned server and storage
clusters to conduct high-performance biological sequence analysis
without using an expensive supercomputer. The benefit of this IaaS
application is reduced drug deployment time with much lower costs.
2. The New York Times has applied Amazon’s EC2 and S3 services to retrieve
useful pictorial information quickly from millions of archival articles and
newspapers. The New York Times has significantly reduced the time and
cost in getting the job done.
3. Pitney Bowes, an e-commerce company, offers clients the opportunity to
perform B2B transactions using the Microsoft Azure platform, along with
.NET and SQL services. These offerings have significantly increased the
company’s client base.

3.5.3 Mashup of Cloud Services


• At the time of this writing, public clouds are in use by a growing number
of users. Owing to concerns about leaking sensitive data in the business
world, more and more enterprises, organizations, and communities are
developing private clouds that demand deep customization.
• An enterprise cloud is used by multiple users within an organization. Each
user may build some strategic applications on the cloud, and demands
customized partitioning of the data, logic, and database in the metadata
representation. More private clouds may appear in the future.
• Based on a 2010 Google search survey, interest in grid computing is
declining rapidly. Cloud mashups have resulted from the need to use
multiple clouds simultaneously or in sequence.
• For example, an industrial supply chain may involve the use of different
cloud resources or services at different stages of the chain. Some public
repository provides thousands of service APIs and mashups for web
commerce services. Popular APIs are provided by Google Maps, Twitter,
YouTube, Amazon eCommerce, Salesforce.com, etc.
3.6 ARCHITECTURAL DESIGN CHALLENGES
Six open challenges in cloud architecture development
1 Service Availability and Data Lock-in Problem
2 Data Privacy and Security Concerns
3 Unpredictable Performance and Bottlenecks
4 Distributed Storage and Widespread Software Bugs
5 Cloud Scalability, Interoperability, and Standardization
6 Software Licensing and Reputation Sharing
3.6.1 Challenge 1—Service Availability and Data Lock-in Problem
• The management of a cloud service by a single company is often the source
of single points of failure. To achieve HA, one can consider using multiple
cloud providers. Even if a company has multiple data centers located in
different geographic regions, it may have common software infrastructure
and accounting systems. Therefore, using multiple cloud providers may
provide more protection from failures.
• Another availability obstacle is distributed denial of service (DDoS)
attacks. Criminals threaten to cut off the incomes of SaaS providers by
making their services unavailable. Some utility computing services offer
SaaS providers the opportunity to defend against DDoS attacks by using
quick scale-ups.
• Software stacks have improved interoperability among different cloud
platforms, but the APIs themselves are still proprietary. Thus, customers
cannot easily extract their data and programs from one site to run on another.
• The obvious solution is to standardize the APIs so that a SaaS developer
can deploy services and data across multiple cloud providers. This would
prevent the loss of all data due to the failure of a single company.
• In addition to mitigating data lock-in concerns, standardization of APIs
enables a new usage model in which the same software infrastructure
can be used in both public and private clouds. Such an option could enable “surge
computing,” in which the public cloud is used to capture the extra tasks that cannot
be easily run in the data center of a private cloud.
3.6.2 Challenge 2—Data Privacy and Security Concerns
• Current cloud offerings are essentially public (rather than private)
networks, exposing the system to more attacks. Many obstacles can be
overcome immediately with well-understood technologies such as
encrypted storage, virtual LANs, and network middleboxes (e.g., firewalls,
packet filters).
• For example, you could encrypt your data before placing it in a cloud. Many
nations have laws requiring SaaS providers to keep customer data and
copyrighted material within national boundaries.
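The advice to encrypt data before placing it in a cloud can be made concrete with a deliberately simplified sketch. The cipher below, a SHA-256 counter-mode keystream XORed with the plaintext, is a toy for demonstration only; a real deployment should use a vetted authenticated cipher such as AES-GCM from a maintained cryptography library:

```python
# TOY demonstration of "encrypt before you upload". Do NOT use this
# construction in production; it is unauthenticated and unreviewed.
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudorandom bytes by hashing key + counter blocks."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def xor_crypt(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

secret = b"customer record #42"
blob = xor_crypt(b"my-key", secret)          # what you would upload
print(blob != secret)                         # True: ciphertext differs
print(xor_crypt(b"my-key", blob) == secret)   # True: round-trips locally
```

The point matches the text: only ciphertext ever leaves the organization, so the provider stores data it cannot read.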
• Traditional network attacks include buffer overflows, DoS attacks,
spyware, malware, rootkits, Trojan horses, and worms. In a cloud
environment, newer attacks may result from hypervisor malware, guest
hopping and hijacking, or VM rootkits.
• Another type of attack is the man-in-the-middle attack for VM migrations.
In general, passive attacks steal sensitive data or passwords. Active attacks
may manipulate kernel data structures which will cause major damage to
cloud servers.
3.6.3 Challenge 3—Unpredictable Performance and Bottlenecks
• Multiple VMs can share CPUs and main memory in cloud computing, but
I/O sharing is problematic. For example, to run 75 EC2 instances with the
STREAM benchmark requires a mean bandwidth of 1,355 MB/second.
However, for each of the 75 EC2 instances to write 1 GB files to the local
disk requires a mean disk write bandwidth of only 55 MB/second. This
demonstrates the problem of I/O interference between VMs. One solution
is to improve I/O architectures and operating systems to efficiently
virtualize interrupts and I/O channels.
• Internet applications continue to become more data-intensive. If we
assume applications to be “pulled apart” across the boundaries of clouds,
this may complicate data placement and transport. Cloud users and
providers have to think about the implications of placement and traffic at
every level of the system, if they want to minimize costs. This kind of
reasoning can be seen in Amazon’s development of its new CloudFront service.
Therefore, data transfer bottlenecks must be removed, bottleneck links must be
widened, and weak servers should be removed.
3.6.4 Challenge 4—Distributed Storage and Widespread Software Bugs
• The database is always growing in cloud applications. The opportunity is
to create a storage system that will not only meet this growth, but also
combine it with the cloud advantage of scaling arbitrarily up and down on
demand. This demands the design of efficient distributed SANs.
• Data centers must meet programmers’ expectations in terms of scalability,
data durability, and HA. Data consistence checking in SAN-connected data
centers is a major challenge in cloud computing.
• Large-scale distributed bugs cannot be reproduced, so the debugging must
occur at a scale in the production data centers. No data center will provide
such a convenience. One solution may be a reliance on using VMs in cloud
computing. The level of virtualization may make it possible to capture
valuable information in ways that are impossible without using VMs.
Debugging over simulators is another approach to attacking the problem,
if the simulator is well designed.
3.6.5 Challenge 5—Cloud Scalability, Interoperability, and Standardization
• The pay-as-you-go model applies to storage and network bandwidth; both
are counted in terms of the number of bytes used. Computation is different
depending on virtualization level. GAE automatically scales in response to
load increases and decreases; users are charged by the cycles used.
• AWS charges by the hour for the number of VM instances used, even if the
machine is idle. The opportunity here is to scale quickly up and down in
response to load variation, in order to save money, but without violating
SLAs.
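The contrast between the two billing styles above, AWS-style per-hour charging versus GAE-style charging for cycles actually used, can be shown with a small calculation; all rates here are hypothetical placeholders, not real AWS or GAE prices:

```python
# Hypothetical comparison of per-hour vs. per-use billing.

def per_hour_bill(hours_provisioned, rate_per_hour=0.10):
    return hours_provisioned * rate_per_hour   # idle hours still cost money

def per_use_bill(busy_hours, rate_per_hour=0.10):
    return busy_hours * rate_per_hour          # pay only while working

# A VM kept provisioned for 24 hours but actually busy for only 6:
print(round(per_hour_bill(24), 2))  # 2.4
print(round(per_use_bill(6), 2))    # 0.6
```

This is why scaling down quickly during quiet periods, without violating SLAs, is the cost-saving opportunity the text identifies.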
• Open Virtualization Format (OVF) describes an open, secure, portable,
efficient, and extensible format for the packaging and distribution of VMs.
It also defines a format for distributing software to be deployed in VMs.
This VM format does not rely on the use of a specific host
platform, virtualization platform, or guest operating system. The approach is to
address virtual platform-agnostic packaging with certification and integrity of
packaged software. The package supports virtual appliances to span more than one
VM.

• OVF also defines a transport mechanism for VM templates, and can apply
to different virtualization platforms with different levels of virtualization.
In terms of cloud standardization, we suggest the ability for virtual
appliances to run on any virtual platform. We also need to enable VMs to
run on heterogeneous hardware platform hypervisors. This requires
hypervisor- agnostic VMs. We also need to realize cross-platform live
migration between x86 Intel and AMD technologies and support legacy
hardware for load balancing. All these issues are wide open for further
research.
3.6.6 Challenge 6—Software Licensing and Reputation Sharing
• Many cloud computing providers originally relied on open source software
because the licensing model for commercial software is not ideal for utility
computing.
• The primary opportunity is either for open source to remain popular or
simply for commercial software companies to change their licensing
structure to better fit cloud computing. One can consider using both pay-
for-use and bulk-use licensing schemes to widen the business coverage.
• One customer’s bad behavior can affect the reputation of the entire cloud.
For instance, blacklisting of EC2 IP addresses by spam-prevention services
may limit smooth VM installation.
• An opportunity would be to create reputation-guarding services similar to
the “trusted e-mail” services currently offered (for a fee) to services hosted
on smaller ISPs. Another legal issue concerns the transfer of legal liability.
Cloud providers want legal liability to remain with the customer, and vice
versa. This problem must be solved at the SLA level.
3.7 CLOUD STORAGE OVERVIEW
• Cloud storage involves exactly what the name suggests—storing your data
with a cloud service provider rather than on a local system. As with other
cloud services, you access the data stored on the cloud via an Internet
link.
• A cloud storage system just needs one data server connected to the
Internet. A subscriber copies files to the server over the Internet, which
then records the data. When a client wants to retrieve the data, he or she
accesses the data server with a web-based interface, and the server then
either sends the files back to the client or allows the client to access and
manipulate the data itself.

Figure A cloud service provider can simply add more commodity hard drives to
increase the organization’s capacity.
• More typically, however, cloud storage systems utilize dozens or hundreds
of data servers. Because servers require maintenance or repair, it is
necessary to store the saved data on multiple machines, providing
redundancy. Without that redundancy, cloud storage systems couldn’t
assure clients that they could access their information at any given time.
• Most systems store the same data on servers using different power
supplies. That way, clients can still access their data even if a power
supply fails.
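Storing copies on servers with different power supplies is, in effect, replica placement across failure domains. A minimal sketch of that idea, with invented server and domain names:

```python
# Sketch of redundancy via failure domains: place each replica on a
# server with a different power supply, so no single failure removes
# every copy. Server and domain names are invented for illustration.

def place_replicas(servers, copies=2):
    """Pick up to `copies` servers, each from a distinct power domain."""
    chosen, used_domains = [], set()
    for name, domain in servers:
        if domain not in used_domains:
            chosen.append(name)
            used_domains.add(domain)
        if len(chosen) == copies:
            break
    return chosen

servers = [("srv-a", "power-1"), ("srv-b", "power-1"), ("srv-c", "power-2")]
print(place_replicas(servers))  # ['srv-a', 'srv-c']
```

Note that srv-b is skipped even though it is next in line, because it shares a power supply with srv-a; that is exactly the guarantee the text describes.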
• Many clients use cloud storage not because they’ve run out of room
locally, but for safety. If something happens to their building, then they
haven’t lost all their data.
3.7.1 Storage as a Service
• The term Storage as a Service (another SaaS acronym, here meaning
storage rather than software) means that a third-party provider rents space
on its storage to end users who lack the budget or capital to pay for it on
their own. It is also ideal when technical personnel are not available or
have inadequate knowledge to implement and maintain that storage
infrastructure.
• Storage service providers are nothing new, but given the complexity of
current backup, replication, and disaster recovery needs, the service has
become popular, especially among small and medium-sized businesses.
• The biggest advantage of Storage as a Service is cost savings. Storage is
rented from the provider using a cost-per-gigabyte-stored or cost-per-data-
transferred model. The end user doesn't have to pay for infrastructure; they
simply pay for how much they transfer and save on the provider's servers.
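The cost-per-gigabyte model can be made concrete with a tiny calculation; the rates below are hypothetical, not any provider's actual prices:

```python
# Sketch of the Storage-as-a-Service pricing model described above.
# Rates are hypothetical placeholders.

def monthly_bill(gb_stored, gb_transferred,
                 rate_storage=0.10, rate_transfer=0.05):
    """Bill = bytes kept on the provider's servers + bytes moved over the WAN."""
    return gb_stored * rate_storage + gb_transferred * rate_transfer

# 500 GB kept on the provider's servers, 40 GB of backup traffic:
print(round(monthly_bill(500, 40), 2))  # 52.0
```

There is no infrastructure line item at all, which is the cost-savings point the text makes.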


Figure Clients rent storage capacity from cloud storage vendors.


• A customer uses client software to specify the backup set and then
transfers data across a WAN. When data loss occurs, the customer can
retrieve the lost data from the service provider.
3.7.2 Providers
• There are hundreds of cloud storage providers on the Web, and more seem
to be added each day. Not only are there general-purpose storage
providers, but there are some that are very specialized in what they store.
Some examples of specialized cloud providers are:
Google Docs allows users to upload documents, spreadsheets, and
presentations to Google’s data servers. Those files can then be edited using
a Google application.
Web email providers like Gmail, Hotmail, and Yahoo! Mail store email
messages on their own servers. Users can access their email from
computers and other devices connected to the Internet.
Flickr and Picasa host millions of digital photographs. Users can create
their own online photo albums.
YouTube hosts millions of user-uploaded video files.
Hostmonster and GoDaddy store files and data for many client web sites.
Facebook and MySpace are social networking sites and allow members to
post pictures and other content. That content is stored on the company’s
servers.
MediaMax and Strongspace offer storage space for any kind of digital
data.
3.7.3 Security
To secure data, most systems use a combination of techniques:
Encryption A complex algorithm is used to encode information. To
decode the encrypted files, a user needs the encryption key. While it’s
possible to crack encrypted information, it’s very difficult and most hackers
don’t have access to the amount of computer processing power they would
need to crack the code.
Authentication processes This requires a user to create a name and
password.
Authorization practices The client lists the people who are authorized to
access information stored on the cloud system. Many corporations have
multiple levels of authorization. For example, a front-line employee might
have limited access to data stored on the cloud and the head of the IT
department might have complete and free access to everything.
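The multi-level authorization example above (a front-line employee versus the head of IT) can be sketched as a role-to-permission lookup; the role names and permission strings are invented for illustration:

```python
# Sketch of multi-level authorization: each role is granted a set of
# permissions, and every access is checked against that set.
# Role and permission names are invented for this example.

PERMISSIONS = {
    "front-line": {"read:own-files"},
    "manager":    {"read:own-files", "read:team-files"},
    "it-head":    {"read:own-files", "read:team-files",
                   "write:any", "delete:any"},
}

def authorized(role: str, action: str) -> bool:
    """Return True only if the role has been granted the action."""
    return action in PERMISSIONS.get(role, set())

print(authorized("front-line", "delete:any"))  # False: limited access
print(authorized("it-head", "delete:any"))     # True: complete access
```

An unknown role gets an empty permission set, so access defaults to denied.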

Figure Encryption and authentication are two security measures you can use
to keep your data safe on a cloud storage provider.
• But even with these measures in place, there are still concerns that data
stored on a remote system is vulnerable. There is always the concern that
a hacker will find a way into the secure system and access the data.
• Also, a disgruntled employee could alter or destroy the data using his or
her own access credentials.
3.7.4 Reliability
• The other concern is reliability. If a cloud storage system is unreliable, it
becomes a liability.
• No one wants to save data on an unstable system, nor would they trust a
company that is financially unstable.
• Most cloud storage providers try to address the reliability concern through
redundancy, but the possibility still exists that the system could crash and
leave clients with no way to access their saved data.
• Reputation is important to cloud storage providers. If there is a perception
that the provider is unreliable, they won’t have many clients. And if they
are unreliable, they won’t be around long, as there are so many players in
the market.
3.7.5 Advantages
• Cloud storage is becoming an increasingly attractive solution for
organizations. That’s because with cloud storage, data resides on the Web,
located across storage systems rather than at a designated corporate
hosting site. Cloud storage providers balance server loads and move data
among various datacenters, ensuring that information is stored close—and
thereby available quickly—to where it is used.
• Storing data on the cloud is advantageous, because it allows you to protect
your data in case there’s a disaster. You may have backup files of your
critical information, but if there is a fire or a hurricane wipes out your
organization, having the backups stored locally doesn’t help. Having your
data stored off- site can be the difference between closing your door for
good or being down for a few days or weeks.

Figure If there is a catastrophe at your organization, having your files backed
up at a cloud storage provider means you won’t have lost all your data.
• Which storage vendor to go with can be a complex issue, and how your
technology interacts with the cloud can be complex. For instance, some
products are agent-based, and the application automatically transfers
information to the cloud via FTP. But others employ a web front end, and
the user has to select local files on their computer to transmit.
• Amazon S3 is the best-known storage solution, but other vendors might
be better for large enterprises. For instance, those who offer service level
agreements and direct access to customer support are critical for a business
moving storage to a service provider.
3.7.6 Cautions
• A mixed approach might be the best way to embrace the cloud, since cloud
storage is still immature. That is, don’t commit everything to the cloud, but
use it for a few, noncritical purposes.
• Large enterprises might have difficulty with vendors like Google or
Amazon, because they are forced to rewrite solutions for their applications
and there is a lack of portability.
• A vendor like 3tera, however, supports applications developed in LAMP,
Solaris, Java, or Windows.NET.
• The biggest deal-breakers when it comes to cloud storage seem to be price
and reliability.
• This is where you have to vet your vendor to ensure you’re getting a good
deal with quality service. One mistake on your vendor’s part could mean
irretrievable data.
• A lot of companies take the “appetizer” approach, testing one or two
services to see how well they mesh with their existing IT systems. It’s
important to make sure the services will provide what you need before you
commit too much to the cloud.

Figure Many companies test out a cloud storage vendor with one or
two services before committing too much to them. This “appetizer”
approach ensures the provider can give you what you want.
• Legal issues are also important. For instance, if you have copyrighted
material—like music or video—that you want to maintain on the cloud,
such an option might not be possible for licensing reasons.
• Also, keep in mind the accountability of your storage provider. Vendors
offer different assurances with the maintenance of data. They may offer the
service, but make sure you know exactly what your vendor will or will not
do in case of data loss or compromise.
• The best solution is to have multiple redundant systems: local and offsite
backup; sync and archive.

3.7.7 Outages
• Further, organizations have to be cognizant of the inherent danger of
storing their data on the Internet. Amazon S3, for example, dealt with a
massive outage in February 2008. The result was numerous client
applications going offline.
• Amazon reports that they have responded to the problem, adding capacity
to the authentication system blamed for the problem. They also note that
no data was lost, because they store multiple copies of every object in
several locations.
• The point remains, however, that clients were not able to access their data
as they had intended, and so you need to use caution when deciding to
pursue a cloud option.
3.7.8 Theft
• You should also keep in mind that your data could be stolen or viewed by
those who are not authorized to see it. Whenever your data is let out of
your own datacenter, you risk trouble from a security point of view.
Figure Whenever you let your data out of your organization, you give
up a measure of security.
• Also, because storage providers put everything into one pot, so to speak,
your company’s data could be stored next to a competitor’s, and the risk
of your competition seeing your proprietary information is real.
• If you do store your data on the cloud, make sure you’re encrypting data
and securing data transit with technologies like SSL.
3.8 CLOUD STORAGE PROVIDERS
• Amazon and Nirvanix are the current industry top dogs, but many others
are in the field, including some well-known names. Google is ready to
launch its own cloud storage solution called GDrive. EMC is readying a
storage solution, and IBM already has a number of cloud storage options
called Blue Cloud.
3.8.1 Amazon Simple Storage Service (S3)
• The best-known cloud storage service is Amazon’s Simple Storage Service
(S3), which launched in 2006. Amazon S3 is designed to make web-scale
computing easier for developers.
• Amazon S3 provides a simple web services interface that can
be used to store and retrieve any amount of data, at any time, from
anywhere on the Web. It gives any developer access to the same highly
scalable data storage infrastructure that Amazon uses to run its own global
network of web sites.
• The service aims to maximize benefits of scale and to pass those
benefits on to developers.
• Amazon S3 is intentionally built with a minimal feature set that includes
the following functionality:
• Write, read, and delete objects containing from 1 byte to 5 gigabytes of
data each. The number of objects that can be stored is unlimited.
• Each object is stored and retrieved via a unique developer-assigned key.
• Objects can be made private or public, and rights can be assigned to
specific users.
• Uses standards-based REST and SOAP interfaces designed to work with
any Internet-development toolkit.
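The minimal feature set above (write/read/delete, unique developer-assigned keys, private/public objects, a 5GB per-object limit) can be illustrated with a toy in-memory model. This is an illustration of the data model only, not Amazon's actual API; all names are hypothetical.

```python
# Toy in-memory model of the S3 data model described above. Illustration only.
MAX_OBJECT_SIZE = 5 * 1024**3  # objects hold from 1 byte to 5 GB of data

class Bucket:
    def __init__(self, name, owner):
        self.name = name        # unique, user-assigned bucket name
        self.owner = owner      # the AWS account that owns the bucket
        self._objects = {}      # key -> (data, acl)

    def put(self, key, data, public=False):
        """Write an object under a unique developer-assigned key."""
        if not 1 <= len(data) <= MAX_OBJECT_SIZE:
            raise ValueError("object must contain 1 byte to 5 GB of data")
        self._objects[key] = (bytes(data), {"public": public})

    def get(self, key, requester):
        """Read an object; private objects are readable only by the owner."""
        data, acl = self._objects[key]
        if not acl["public"] and requester != self.owner:
            raise PermissionError("access denied by ACL")
        return data

    def delete(self, key):
        """Delete an object by key."""
        del self._objects[key]

b = Bucket("examplebucket", owner="alice")
b.put("examplekey", b"hello", public=False)
```

The real service exposes these same operations over REST and SOAP interfaces rather than as in-process method calls.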

3.8.1.1 Design Requirements


Amazon built S3 to fulfill the following design requirements:
• Scalable: Amazon S3 can scale in terms of storage, request rate, and users
to support an unlimited number of web-scale applications.
• Reliable: Store data durably, with 99.99 percent availability; Amazon
says the design does not allow for any downtime.
• Fast: Amazon S3 was designed to be fast enough to support high-performance
applications. Server-side latency must be insignificant relative
to Internet latency. Any performance bottleneck can be fixed by simply
adding nodes to the system.
• Inexpensive: Amazon S3 is built from inexpensive commodity hardware
components. As a result, frequent node failure is the norm and must not
affect the overall system. It must be hardware-agnostic, so that savings can
be captured as Amazon continues to drive down infrastructure costs.
• Simple: Building highly scalable, reliable, fast, and inexpensive storage is
difficult. Doing so in a way that makes it easy to use for any application
anywhere is more difficult. Amazon S3 must do both.
A forcing function for the design was that a single Amazon S3
distributed system must support the needs of both internal Amazon
applications and external developers of any application. This means that it
must be fast and reliable enough to run Amazon.com’s web sites, while
flexible enough that any developer can use it for any data storage need.
3.8.1.2 Design Principles
Amazon used the following principles of distributed system design to
meet Amazon S3 requirements:

• Decentralization: It uses fully decentralized techniques to remove scaling
bottlenecks and single points of failure.
• Autonomy: The system is designed such that individual components can
make decisions based on local information.
• Local responsibility: Each individual component is responsible for
achieving its consistency; this is never the burden of its peers.
• Controlled concurrency: Operations are designed such that no or limited
concurrency control is required.
• Failure toleration: The system considers the failure of components to be
a normal mode of operation, and continues operation with no or minimal
interruption.
• Controlled parallelism: Abstractions used in the system are of such
granularity that parallelism can be used to improve performance and
robustness of recovery or the introduction of new nodes.
• Small, well-understood building blocks: Do not try to provide a single
service that does everything for everyone; instead, build small
components that can be used as building blocks for other services.
• Symmetry: Nodes in the system are identical in terms of functionality, and
require no or minimal node-specific configuration to function.
• Simplicity: The system should be made as simple as possible, but no
simpler.
3.8.1.3 How S3 Works
• Amazon keeps its lips pretty tight about how S3 works, but according to
Amazon, S3’s design aims to provide scalability, high availability, and low
latency at commodity costs.
• S3 stores arbitrary objects of up to 5GB in size, each accompanied
by up to 2KB of metadata. Objects are organized into buckets. Each bucket
is owned by an AWS account, and buckets are identified by a unique,
user-assigned key.
Figure: Multiple objects are stored in buckets in Amazon S3; each object
pairs up to 2KB of metadata with up to 5GB of data.
• Buckets and objects are created, listed, and retrieved using either a
REST-style or SOAP interface. Objects can also be retrieved using the HTTP
GET interface or via BitTorrent. An access control list restricts who can
access the data in each bucket.
• Bucket names and keys are formulated so that they can be accessed using
HTTP.
• Requests are authorized using an access control list associated with each
bucket and object, for instance:
http://s3.amazonaws.com/examplebucket/examplekey
http://examplebucket.s3.amazonaws.com/examplekey

• The Amazon AWS Authentication tools allow the bucket owner to create
an authenticated URL that is valid for a set amount of time.
For instance, you could create a link to your data on the cloud, give that
link to someone else, and they could access your data for an amount of
time you predetermine, be it 10 minutes or 10 hours.
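The time-limited authenticated URLs described above can be sketched with S3's classic (2006-era) query-string authentication scheme: an HMAC-SHA1 signature over the request method, expiry timestamp, and resource path, appended to the URL. The helper below is a stdlib-only illustration of that historical scheme, not a current AWS signing implementation; all key values are dummies.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote

def presigned_url(bucket, key, secret_key, access_key, expires_in=600):
    """Sketch of classic S3 query-string authentication: the URL is valid
    until the Expires timestamp, for anyone who holds it."""
    expires = int(time.time()) + expires_in
    # Historical StringToSign: verb, content-md5, content-type, expires, resource
    string_to_sign = f"GET\n\n\n{expires}\n/{bucket}/{key}"
    signature = base64.b64encode(
        hmac.new(secret_key.encode(), string_to_sign.encode(),
                 hashlib.sha1).digest()
    ).decode()
    return (f"http://s3.amazonaws.com/{bucket}/{key}"
            f"?AWSAccessKeyId={access_key}&Expires={expires}"
            f"&Signature={quote(signature, safe='')}")

# Example: a link to examplekey that expires in 10 minutes (dummy credentials).
url = presigned_url("examplebucket", "examplekey",
                    "dummy-secret", "AKIDEXAMPLE", expires_in=600)
```

Because the signature covers the expiry time, tampering with `Expires` invalidates the URL; the service simply refuses requests after the deadline.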
• Bucket items can also be accessed via a BitTorrent feed, enabling S3 to act
as a seed for the client. Buckets can also be set up to save HTTP log
information to another bucket. This information can be used for later data
mining.
• “Amazon S3 is based on the idea that quality Internet-based storage should
be taken for granted,” said Andy Jassy, vice president of Amazon Web
Services. “It helps free developers from worrying about where they are
going to store data, whether it will be safe and secure, if it will be available
when they need it, the costs associated with server maintenance, or
whether they have enough storage available. Amazon S3 enables
developers to focus on innovating with data, rather than figuring out how
to store it.”
• S3 lets developers pay only for what they consume, and there is no
minimum fee.
• Developers pay just $0.15 per gigabyte of storage per month and $0.20
per gigabyte of data transferred. This might not seem like a lot of money,
but storing 1TB would cost $1,800 per year in storage alone, whereas an
internal 1TB drive these days costs about $100 to own outright.
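The arithmetic behind that comparison is straightforward at the historical rates quoted above (the function name and parameters are illustrative):

```python
def s3_yearly_cost(stored_gb, transferred_gb_per_month,
                   storage_rate=0.15, transfer_rate=0.20):
    """Yearly pay-per-use cost at the historical rates quoted above:
    $0.15/GB-month for storage, $0.20/GB for data transfer."""
    monthly = stored_gb * storage_rate + transferred_gb_per_month * transfer_rate
    return 12 * monthly

# 1 TB (1,000 GB) stored, no transfer: 1000 * 0.15 * 12 = $1,800 per year,
# matching the figure in the text.
storage_only = s3_yearly_cost(1000, 0)
```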
• So it’s really not so much about the cost of storage as it is about the total
cost to serve.
3.8.1.4 Early S3 Applications
• The science team at the University of California Berkeley responsible for
