Cloud Unit III
The evolution of data storage technology has moved from bulky and slow
mechanical methods to compact and efficient digital solutions. Early forms like
punch cards and magnetic tapes gave way to hard disk drives (HDDs), which were
then challenged by the faster and more durable solid-state drives (SSDs). Cloud
storage represents a more recent shift, offering accessibility and scalability, while
also presenting new challenges in terms of data security and privacy.
Several different types of data storage have appeared throughout history, including punch cards, magnetic tape, hard disk drives, and cloud storage.
Punch Cards:
The earliest form of digital data storage, invented in the 19th century and used
extensively in early computers for storing and processing information.
Magnetic Tape:
Introduced in the 1950s, magnetic tape offered higher storage capacity and was
used for data archiving and backup.
Magnetic Storage:
Hard Disk Drives (HDDs):
Developed in the 1950s, HDDs provided faster access to data compared to tape,
becoming a standard for personal computers.
Floppy Disks:
Introduced in the 1970s, floppy disks offered portable storage, though with limited
capacity.
Zip Drives:
A later iteration of removable storage with higher capacity than floppy disks, but
ultimately not as successful as hard drives, according to Platinum Data Recovery.
Optical Storage:
CDs, DVDs, Blu-ray: These technologies used lasers to read and write data onto
optical discs, offering higher storage capacities than previous media.
Solid-State Storage:
Flash Memory and Solid-State Drives (SSDs): Flash memory, like that found in
SSDs, offers faster data access speeds and greater durability than HDDs, becoming
increasingly popular for laptops and desktops.
Cloud Storage:
Cloud computing offers remote storage solutions, providing accessibility and
scalability while potentially reducing the need for local storage devices.
Cloud Storage:
Cloud storage refers to the practice of saving data on remote
servers accessible via the internet, instead of on local storage devices or personal
computers.
The primary benefits of cloud storage include scalability, cost-effectiveness, and
accessibility, as it allows users to store vast amounts of data while offering the
flexibility to access it from any location with internet connectivity.
STORAGE MODELS:
Storage models define how data is organized and accessed in a data store. Two related views are the data model (the logical aspects) and the storage model (the physical layout). Common types include file storage, block storage, and object
storage, each suited for different needs and applications. Cloud storage adds
another layer with models like public, private, and hybrid clouds, offering various
levels of security and control.
● Data Model:
Focuses on the logical structure of data within a database, defining how data
is organized and related.
● Storage Model:
Describes the physical layout of data on storage media (e.g., hard drive,
cloud).
● Instance Storage:
Virtual disks in the cloud, often used in traditional virtualization
environments.
● Volume Storage:
Storage accessed via a network, similar to a SAN (Storage Area Network)
but in the cloud.
● Object Storage:
Web-scale storage optimized for handling large amounts of unstructured
data.
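To make the contrast between block storage and object storage above more concrete, the toy classes below compare block-style access (fixed-size blocks addressed by number) with object-style access (whole objects addressed by a key plus metadata). This is a minimal in-memory sketch; every class, method, and key name is invented for the example and does not correspond to any real cloud API.

```python
# Toy contrast of block storage vs. object storage (illustrative only).

class BlockDevice:
    """Block storage: fixed-size blocks addressed by block number."""
    def __init__(self, block_size=4096, num_blocks=1024):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * num_blocks

    def write_block(self, block_no, data):
        # A real block device would be used underneath a file system or database.
        self.blocks[block_no] = data.ljust(self.block_size, b"\0")

    def read_block(self, block_no):
        return self.blocks[block_no]


class ObjectStore:
    """Object storage: whole objects addressed by a flat key, with metadata."""
    def __init__(self):
        self.objects = {}

    def put(self, key, data, metadata=None):
        self.objects[key] = {"data": data, "metadata": metadata or {}}

    def get(self, key):
        return self.objects[key]["data"]


disk = BlockDevice()
disk.write_block(0, b"superblock")            # file systems are built on top of blocks
store = ObjectStore()
store.put("backups/2024/db.dump", b"...", {"content-type": "application/octet-stream"})
print(disk.read_block(0)[:10], store.get("backups/2024/db.dump"))
```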
Solid State Drives (SSDs) are storage devices that use flash memory to store data,
unlike traditional Hard Disk Drives (HDDs) which use spinning disks and moving
parts. This difference in technology gives SSDs several advantages, including
faster performance, greater durability, and lower power consumption.
Speed:
SSDs offer significantly faster read and write speeds compared to HDDs, leading
to quicker boot times, faster application loading, and improved overall system
responsiveness.
Durability:
Without moving parts, SSDs are more resistant to physical shock and vibration,
making them more reliable for portable devices like laptops.
Power Consumption:
SSDs consume less power than HDDs, which can translate to longer battery life in
laptops and reduced energy costs in data centers.
Noise:
SSDs are silent, as they have no moving parts that can create noise, unlike HDDs.
Drawbacks of SSDs:
Cost: SSDs are generally more expensive per gigabyte compared to HDDs.
Storage Capacity: While SSD capacities are increasing, they may not yet match the
vast storage options available with HDDs.
Limited Write Cycles: Flash memory used in SSDs has a finite number of write
cycles before it begins to degrade, though this is less of a concern with modern
SSDs.
A file system and a DBMS are two kinds of data management systems that are
used in different capacities and possess different characteristics.
A File System is a way of organizing files into groups and folders and then storing
them on a storage device. It manages the storage medium and enables users to perform operations such as reading, writing, and deleting files.
On the other hand, a DBMS is a more elaborate software application dedicated to managing large amounts of structured data. It provides facilities such as querying, indexing, transactions, and data integrity.
Although a file system serves well for applications where data is stored simply and does not require much organization, a DBMS is more appropriate for applications where data must be organized, structured, secured, and optimized.
File System
The file system is basically a way of arranging the files in a storage medium like a
hard disk. The file system organizes the files and helps in the retrieval of files
when they are required. File systems consist of different files which are grouped
into directories. The directories further contain other folders and files. The file
system performs basic operations such as file management, file naming, and setting access rules.
Examples: NTFS (New Technology File System), EXT (Extended File System).
Differences between a File System and a DBMS:
● Query Processing: There is no efficient query processing in the file system, whereas a DBMS provides efficient query processing.
● Consistency: There is less data consistency in the file system, whereas a DBMS offers more data consistency because of the process of normalization.
● Cost: A file system is less expensive than a DBMS, whereas a DBMS has a comparatively higher cost than a file system.
● Data Independence: There is no data independence in a file system, whereas in a DBMS data independence exists, mainly of two types: 1) Logical Data Independence and 2) Physical Data Independence.
● User Access: In a file system, only one user can access data at a time, whereas in a DBMS multiple users can access data at a time.
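As a small illustration of the query-processing difference listed above, the sketch below stores the same records first as plain comma-separated lines in a file (file-system style, where the application must parse and filter everything itself) and then in an SQLite table, where the DBMS executes the query. It uses only the Python standard library; the file, table, and column names are chosen just for this example.

```python
import os
import sqlite3
import tempfile

records = [("alice", 30), ("bob", 25), ("carol", 35)]

# File-system style: the application parses and filters the raw lines itself.
path = os.path.join(tempfile.mkdtemp(), "users.txt")
with open(path, "w") as f:
    for name, age in records:
        f.write(f"{name},{age}\n")
with open(path) as f:
    over_28_file = [line.split(",")[0] for line in f if int(line.split(",")[1]) > 28]

# DBMS style: SQLite plans and executes the declarative query for us.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", records)
over_28_db = [row[0] for row in conn.execute("SELECT name FROM users WHERE age > 28")]

print(over_28_file, over_28_db)   # both print ['alice', 'carol']
```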
A general parallel file system in cloud computing allows multiple computing nodes to access the
same file system concurrently, enabling high-performance data access for large-scale
applications.
Examples include GPFS (now IBM Storage Scale) and Lustre, which are used to manage vast
amounts of data in distributed environments.
● Scalability:
They can handle growing data volumes and increasing performance demands by adding
more storage nodes.
● High Performance:
Parallel access to data across multiple nodes maximizes throughput and minimizes
latency, crucial for applications like scientific computing and data analytics.
● Data Distribution:
Files are typically broken into smaller blocks that are distributed across multiple storage devices, enabling parallel access (a small striping sketch follows this list).
● High Availability:
Features like data replication and redundancy ensure data durability and availability even
if some nodes fail.
● Compatibility and Integration:
They are designed to integrate with various cloud platforms and applications, supporting
diverse workloads and access patterns.
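A minimal sketch of the data-distribution point above, assuming nothing beyond the Python standard library: a byte string stands in for a file, it is cut into fixed-size blocks, and the blocks are striped round-robin across several "storage nodes" (plain in-memory lists here). A real parallel file system would place the blocks on separate servers and read them concurrently.

```python
# Toy round-robin striping of a file across several storage nodes.
BLOCK_SIZE = 4     # deliberately tiny so the result is easy to inspect
NUM_NODES = 3

nodes = [[] for _ in range(NUM_NODES)]     # nodes[i] holds (block_no, block_bytes)

data = b"The quick brown fox jumps over the lazy dog"
blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

for block_no, block in enumerate(blocks):
    nodes[block_no % NUM_NODES].append((block_no, block))   # round-robin placement

# Reassembly: gather blocks from every node and order them by block number.
gathered = sorted((b for node in nodes for b in node), key=lambda pair: pair[0])
assert b"".join(block for _, block in gathered) == data
print([len(node) for node in nodes])       # blocks held per node, e.g. [4, 4, 3]
```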
Google Inc. developed the Google File System (GFS), a scalable distributed file
system (DFS), to meet the company's growing data processing needs. GFS offers
fault tolerance, reliability, scalability, availability, and performance to large networks of connected nodes. GFS is made up of a number of storage systems constructed from inexpensive commodity hardware components. The search engine, which generates enormous volumes of data that must be stored, is only one example of how it is customized to meet Google's various data use and storage requirements. The Google File System was designed to tolerate hardware failures while still taking advantage of inexpensive, commercially available servers.
GoogleFS is another name for GFS. It manages two types of data, namely file metadata and file data.
The GFS node cluster consists of a single master and several chunk servers that
various client systems regularly access. On local disks, chunk servers keep data in the form of Linux files. The stored data is split into large (64 MB) chunks, which are replicated at least three times across the network; the large chunk size reduces network overhead.
The largest GFS clusters are made up of more than 1,000 nodes with 300 TB of disk storage capacity, which hundreds of clients can access continuously.
Components of GFS
GFS Clients: They can be computer programs or applications which may be used
to request files. Requests may be made to access and modify already-existing files
or add new files to the system.
GFS Master Server: It serves as the cluster's coordinator. It preserves a record of
the cluster's actions in an operation log. Additionally, it keeps track of the data that
describes chunks, or metadata. The metadata tells the master server which file each chunk belongs to and where it fits within that file.
GFS Chunk Servers: They are the GFS's workhorses. They keep 64 MB-sized file
chunks. Chunk data does not pass through the master server; instead, the chunk servers deliver the requested chunks directly to the clients. To ensure reliability, the GFS makes numerous copies of each chunk and stores them on different chunk servers; the default is three copies. Each copy is referred to as a replica.
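To make the master/chunk-server split concrete, here is a toy model of the metadata a master might keep: a file is divided into 64 MB chunks and each chunk is assigned three replicas on different chunk servers, while the chunk contents themselves would live only on those servers. This is not Google's actual code or API; the server names, file name, and helper function are invented for the example.

```python
import itertools

CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB chunks, as in GFS
REPLICAS = 3                    # default number of copies per chunk

chunk_servers = itertools.cycle(["cs-01", "cs-02", "cs-03", "cs-04", "cs-05"])

def register_file(name, size_bytes, metadata):
    """Toy master: record which chunks a file has and where the replicas live.

    The master stores only this metadata; chunk data stays on the chunk servers.
    """
    num_chunks = -(-size_bytes // CHUNK_SIZE)        # ceiling division
    metadata[name] = [
        {"chunk_id": f"{name}#{i}",
         "replicas": [next(chunk_servers) for _ in range(REPLICAS)]}
        for i in range(num_chunks)
    ]

master_metadata = {}
register_file("/logs/crawl-2024.log", 200 * 1024 * 1024, master_metadata)   # ~200 MB file
for chunk in master_metadata["/logs/crawl-2024.log"]:
    print(chunk["chunk_id"], "->", chunk["replicas"])
```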
Features of GFS
● Namespace management and locking.
● Fault tolerance.
● Reduced client and master interaction because of the large chunk size.
● High availability.
● Critical data replication.
● Automatic and efficient data recovery.
● High aggregate throughput.
Advantages of GFS
Disadvantages of GFS
LOCKS:
Locks are essential for maintaining data consistency and preventing race conditions
in distributed systems.
Chubby provides a robust mechanism for implementing distributed locks, ensuring
that only one client can hold a lock on a resource at a time.
Distributed systems often need to synchronize access to shared resources (e.g.,
files, databases, caches) to prevent data corruption or inconsistencies.
Types of locks:
● Advisory locks: These locks don't prevent access to resources but rather indicate that a client is using the resource (see the sketch after this list).
● Enforced locks: These locks prevent access to resources based on the lock
status.
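The advisory-lock idea can be seen with an ordinary POSIX file lock: processes that request the lock are serialized, but a process that never asks for the lock can still read or write the file. The sketch below uses Python's standard fcntl module (Unix-only); the file path is arbitrary.

```python
import fcntl

# Advisory locking with flock: cooperating processes that request the lock
# take turns, but the lock does not physically prevent access by a process
# that simply ignores it -- that is what makes it "advisory".
with open("/tmp/shared-resource.txt", "a") as f:
    fcntl.flock(f, fcntl.LOCK_EX)       # block until we hold the exclusive lock
    try:
        f.write("only one cooperating writer at a time\n")
        f.flush()
    finally:
        fcntl.flock(f, fcntl.LOCK_UN)   # release so other cooperating processes proceed
```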
Chubby is a robust and reliable distributed lock service that plays a crucial role in the
operation of many Google services and other large-scale distributed systems by
providing a foundation for coordination and consistency.
Features:
● Fault Tolerance:
Chubby is designed to be highly available and reliable, with the ability to elect
a new master if the current one fails.
● Coarse-grained Locking:
Chubby's locks are relatively large, meaning they control access to entire files
or directories rather than smaller pieces of data.
● Event Notifications:
Clients can subscribe to events related to files and locks, such as modification
of file contents or changes in lock status.
● Low-volume Storage:
Chubby can also be used as a repository for configuration data and other small
amounts of information.
How it Works:
● Master Election:
A distributed consensus protocol (like Paxos) is used to elect a master server
within the Chubby cell.
● Client Interaction:
Clients interact with the master server to acquire and release locks, read and
write data, and subscribe to events.
● Event Handling:
When a client subscribes to an event, the Chubby server notifies the client
when the event occurs, allowing the client to react accordingly.
● Cache Invalidation:
Clients cache file data and metadata, and updates invalidate the cache,
requiring clients to refresh their data.
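The toy class below is an in-process stand-in for the client-interaction and event-handling steps above: clients try to acquire a coarse-grained lock on a named path, and subscribers are notified whenever the lock changes hands. It is not Chubby's real interface (Chubby is a replicated RPC service with sessions, leases, and Paxos-based master election); every name here is invented for illustration.

```python
import threading
from collections import defaultdict

class ToyLockService:
    """In-process stand-in for a coarse-grained lock service with events."""

    def __init__(self):
        self._holders = {}                      # path -> name of the lock holder
        self._subscribers = defaultdict(list)   # path -> event callbacks
        self._mutex = threading.Lock()

    def try_acquire(self, path, client):
        """Grant the lock on `path` to `client` only if nobody else holds it."""
        with self._mutex:
            if path in self._holders:
                return False
            self._holders[path] = client
            self._notify(path, f"lock acquired by {client}")
            return True

    def release(self, path, client):
        with self._mutex:
            if self._holders.get(path) == client:
                del self._holders[path]
                self._notify(path, f"lock released by {client}")

    def subscribe(self, path, callback):
        self._subscribers[path].append(callback)

    def _notify(self, path, event):
        for callback in self._subscribers[path]:
            callback(path, event)


service = ToyLockService()
service.subscribe("/ls/cell/master", lambda path, event: print(path, "->", event))
print(service.try_acquire("/ls/cell/master", "client-A"))   # True, and the event fires
print(service.try_acquire("/ls/cell/master", "client-B"))   # False: client-A holds it
service.release("/ls/cell/master", "client-A")
```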
NoSQL DATABASES:
A NoSQL (Not Only SQL) database is a type of database that deviates from the
traditional relational database model. It's designed to handle large volumes of
diverse, unstructured, and rapidly changing data, offering flexibility and scalability
not always found in SQL databases.
● Non-relational:
NoSQL databases do not store data in the fixed tables of rows and columns used by relational databases; instead they use flexible structures such as documents, key-value pairs, wide columns, or graphs (see the sketch after this list).
● Scalability:
NoSQL databases are designed to handle large volumes of data and high
traffic loads by distributing data across multiple servers (horizontal scaling).
● BASE compliance:
NoSQL databases often prioritize availability, eventual consistency, and soft
state (temporary inconsistencies) over strict ACID (Atomicity, Consistency,
Isolation, Durability) compliance found in SQL databases.
● Query languages:
NoSQL databases can support SQL-like query languages or have their own query languages optimized for their specific data models.
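As a tiny illustration of the non-relational, flexible-schema point above: the documents below sit in one collection even though they do not share the same fields, and a query simply inspects whatever fields happen to be present. This is a plain in-memory sketch, not a real NoSQL client; all field values are made up.

```python
# Document-style storage: records in one collection need not share a schema.
collection = [
    {"_id": 1, "name": "alice", "email": "alice@example.com"},
    {"_id": 2, "name": "bob", "tags": ["admin", "beta"]},            # no email field
    {"_id": 3, "name": "carol", "address": {"city": "Pune"}},        # nested document
]

def find(coll, **criteria):
    """Return documents whose fields match every given criterion."""
    return [doc for doc in coll
            if all(doc.get(field) == value for field, value in criteria.items())]

print(find(collection, name="bob"))                   # works despite differing schemas
print(find(collection, email="alice@example.com"))
```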
Characteristics of OLTP (Online Transaction Processing):
● Enable multi-user access to the same data, while ensuring data integrity:
OLTP systems rely on concurrency algorithms to ensure that no two users can change the same data at the same time and that all transactions are carried out in the proper order. This prevents users of online reservation systems from double-booking the same room and protects holders of jointly held bank accounts from accidental overdrafts (a minimal sketch follows this list).
Databases commonly used for OLTP:
● Relational Databases:
Traditional relational databases like MySQL, PostgreSQL, and Oracle are
commonly used for OLTP due to their proven reliability and support for
complex transactions.
● NoSQL Databases:
Some NoSQL databases, like those using key-value or document models, are
also suitable for specific OLTP workloads, particularly those requiring high
scalability and flexible schemas.
● In-Memory Databases:
In-memory OLTP solutions leverage system memory to store and process
transaction data, further reducing latency and improving performance.
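A minimal sketch of the no-double-booking behaviour described in the first point above, assuming only the Python standard library: SQLite runs the insert inside a single transaction and a uniqueness constraint rejects a second booking for the same room and night, so two clients cannot both succeed. The table and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations ("
             "room TEXT, night TEXT, guest TEXT, UNIQUE(room, night))")

def book(conn, room, night, guest):
    """Insert a reservation inside one transaction; fail if the slot is taken."""
    try:
        with conn:   # BEGIN ... COMMIT, or ROLLBACK if an exception is raised
            conn.execute("INSERT INTO reservations VALUES (?, ?, ?)",
                         (room, night, guest))
        return True
    except sqlite3.IntegrityError:        # UNIQUE constraint: already booked
        return False

print(book(conn, "101", "2024-07-01", "alice"))   # True
print(book(conn, "101", "2024-07-01", "bob"))     # False: double booking rejected
```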
BIGTABLE:
Bigtable is a fully managed, scalable NoSQL database service offered by Google
Cloud Platform (GCP). It's designed for handling massive amounts of structured
and semi-structured data with low latency and high throughput, making it suitable
for a wide range of applications such as machine learning, operational analytics, and user-facing applications.
Characteristics of Bigtable:
● NoSQL Database:
Bigtable is a NoSQL database, meaning it doesn't use the traditional
relational database model (tables with rows and columns).
● Wide-Column Store:
It's a wide-column store, where data is organized into columns that are
grouped into column families.
● Key-Value Store:
Data is accessed using a key-value pair structure, with the row key serving as the primary index (a small sketch of this layout follows this list).
● Scalability:
Bigtable is designed to scale from terabytes to petabytes of data, handling
billions of rows and thousands of columns.
● Low Latency:
It offers low latency access to data, making it suitable for real-time
applications and high-speed data processing.
● Fully Managed:
Google handles the infrastructure, management, and maintenance of
Bigtable, allowing users to focus on their applications.
● Replication:
Bigtable supports data replication across multiple regions for high
availability and disaster recovery.
● Integration:
It integrates with other Google Cloud services like BigQuery, Dataflow, and
Dataproc, as well as the open-source HBase API.
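The wide-column, key-value layout described above can be pictured as a nested map: row key -> column family -> column qualifier -> timestamped values. The snippet below is just a plain-Python picture of that shape, not the Cloud Bigtable client library; every identifier and value is made up for the example.

```python
# A wide-column table as nested maps:
#   row key -> column family -> column qualifier -> list of (timestamp, value)
table = {
    "user#alice": {
        "profile":  {"name": [(1700000000, "Alice")]},
        "activity": {"last_login": [(1700500000, "2023-11-20"),
                                    (1700000000, "2023-11-14")]},   # older version kept
    },
    "user#bob": {
        "profile": {"name": [(1700100000, "Bob")]},   # rows need not share columns
    },
}

def read_cell(table, row_key, family, qualifier):
    """Return the most recent value of one cell, or None if the cell is absent."""
    versions = table.get(row_key, {}).get(family, {}).get(qualifier, [])
    return max(versions)[1] if versions else None

print(read_cell(table, "user#alice", "activity", "last_login"))   # '2023-11-20'
print(read_cell(table, "user#bob", "activity", "last_login"))     # None
```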
Use Cases:
● Machine Learning:
Bigtable can be used as a storage engine for large datasets used in machine
learning models.
● Operational Analytics:
It's suitable for analyzing real-time data streams and generating insights for
operational monitoring and decision-making.
● Data Warehousing:
Bigtable can be used for data warehousing, especially when dealing with
large volumes of non-relational data and requiring real-time insights.
MEGASTORE:
Megastore is a storage system developed by Google that combines the scalability of a NoSQL datastore with the strong consistency guarantees of a traditional relational database; it is built on Bigtable and has supported many of Google's interactive production services.
● Scalability:
Megastore leverages data partitioning and replication to handle large
volumes of data and high traffic.
● High Availability:
It achieves high availability through synchronous replication across data
centers and seamless failover mechanisms.
● Strong Consistency:
Megastore provides ACID (Atomicity, Consistency, Isolation, Durability)
transactions, ensuring data consistency across multiple data centers.
● Entity Groups:
Data is organized into entity groups, allowing for efficient transaction
management and strong consistency guarantees within these groups.
● Modified Paxos:
Megastore uses a modified Paxos algorithm for consensus and replication
across data centers, ensuring fault tolerance and strong consistency.
● Built on Bigtable and Chubby:
Megastore is built on top of Google's Bigtable and Chubby infrastructure,
leveraging their scalability and reliability.
● Wide Range of Production Services:
It has been used to support a variety of Google's production services.
Storage reliability at scale refers to the ability of a storage system to consistently and
accurately store and retrieve data, even when dealing with a large volume of data and
a large number of storage components. It's crucial for large-scale systems to ensure
data integrity and availability despite the potential for component failures.