Huawei Cloud Database Solution Design
Foreword
⚫ This lesson describes Huawei Cloud database services and
introduces some principles for designing database solutions.
Objectives
⚫ Upon completion of this course, you will understand:
Huawei Cloud database services
Database solution design principles
Contents
1. Database Overview
3. Data Caching
Customer Requirements
We need to install a user management system and use a database to record user information.
We also need to collect work data of smart devices. There are a large number of data entries,
but each entry only has a small amount of data. Server high availability does not need to be
considered, but database reliability must be guaranteed.
Database Selection (1)
Database Selection (2)
We need to install a user management system and use a database to record user information. We
also need to collect work data of smart devices. There are a large number of data entries, but each
entry only has a small amount of data. Server high availability does not need to be considered, but
database reliability must be guaranteed.
Database Selection (3)
Unstructured data Structured data
Book Title | Published | Author
The Great Gatsby | 1925 | F. Scott Fitzgerald
Lord of the Rings | 1968 | J. R. R. Tolkien
EVS OBS
Additional Considerations for Choosing Relational and NoSQL Databases
Item | Relational Database (SQL) | Non-Relational Database (NoSQL)
Storage characteristics | It performs data integrity checks but has limited performance. Data integrity is ensured by DB engines. | It only focuses on data reads and writes and rarely checks data integrity. Data integrity is ensured by applications.
Transaction support | Good | Not supported
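The comparison above says that a relational engine checks data integrity itself and supports transactions. The following is a minimal sketch of both behaviors using Python's built-in sqlite3 module as a stand-in for a relational engine. The table name `users` and the sample names are assumptions made for illustration, and SQLite is used only because it needs no server; it is not a Huawei Cloud service.

```python
import sqlite3


def demo_integrity_and_transactions():
    """Show engine-enforced integrity and transaction rollback with SQLite."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: the engine enforces NOT NULL and UNIQUE on name.
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
    )
    conn.execute("INSERT INTO users (name) VALUES ('alice')")
    conn.commit()

    # The engine rejects a duplicate name; the application does not have to check.
    try:
        conn.execute("INSERT INTO users (name) VALUES ('alice')")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    # A transaction is all-or-nothing: the failed batch leaves no partial rows.
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute("INSERT INTO users (name) VALUES ('bob')")
            conn.execute("INSERT INTO users (name) VALUES (NULL)")  # violates NOT NULL
    except sqlite3.IntegrityError:
        pass

    names = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY name")]
    conn.close()
    return duplicate_rejected, names
```

In this sketch the row for 'bob' is rolled back together with the invalid row, so only the originally committed record remains. With a store that does not check integrity, the application would have to implement both checks itself.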
Evolution of Database Deployment
[Figure: three deployment models compared layer by layer. Layers shown for each model: database optimization; patch and configuration management; OS; power supply, cooling, and cabinets.]
Huawei Cloud Database Portfolio
Relational database services NoSQL database services
For traditional database workloads For fast-growing workloads
Distributed Database Middleware (DDM)
Database and Application Migration UGO (UGO)
Data Replication Service (DRS)
Data Warehouse Service (GaussDB(DWS))
Data Admin Service (DAS)
Database Security Service (DBSS)
Relational Database Service (RDS)
Other Huawei Database Services
Building a Simple Architecture
Backups in OBS
⚫ However, this architecture has service availability issues.
High Availability
AZ AZ 1 AZ 2
Failover Failover
Building an HA Architecture
Backups in
OBS
Backup Characteristics
01 Backups are stored in OBS and can be restored to the customer's on-premises data center.
02 Automated binlog backup and manual backup are supported.
03 Binlog backups can be restored to a specific point in time.
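Point-in-time restoration from binlog backups can be pictured as "load the last full backup, then replay logged changes up to the chosen moment". The sketch below models that idea in plain Python. The `LogEvent` type, the integer timestamps, and the dictionary used as a database are all hypothetical simplifications; this is not the RDS backup format or API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class LogEvent:
    """One logged change (hypothetical, simplified stand-in for a binlog event)."""
    timestamp: int            # time of the change, in arbitrary units
    key: str                  # record that changed
    value: Optional[str]      # new value, or None if the record was deleted


def restore_to_point_in_time(
    full_backup: Dict[str, str],
    log_events: Iterable[LogEvent],
    target_time: int,
) -> Dict[str, str]:
    """Rebuild the state as of target_time from a full backup plus logged changes."""
    state = dict(full_backup)  # start from a copy of the full backup
    for event in sorted(log_events, key=lambda e: e.timestamp):
        if event.timestamp > target_time:
            break  # changes after the target time are not replayed
        if event.value is None:
            state.pop(event.key, None)
        else:
            state[event.key] = event.value
    return state
```

For example, with a backup of `{"a": "1"}` and events at times 10, 20, and 30, restoring to time 15 replays only the first event.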
Scaling Out Using Read Replicas
RDS
(Primary)
Discussion
NoSQL Characteristics
01 There are no logical relationships between the data records in a collection.
02 Each data record is a structured document.
03 Users can locate data using key values rather than through complex queries.
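The three characteristics on this slide can be illustrated with a very small in-memory model: records are self-contained documents, they are not linked to one another, and they are fetched by key. The `DocumentCollection` class and the sample device records are hypothetical and only mimic the access pattern; they do not represent the DDS or MongoDB API.

```python
from typing import Any, Dict, Optional


class DocumentCollection:
    """Toy collection: independent documents located by key (illustration only)."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, document: Dict[str, Any]) -> None:
        # Each record is a self-contained document; no schema is enforced.
        self._docs[key] = dict(document)

    def get(self, key: str) -> Optional[Dict[str, Any]]:
        # Data is located directly by key rather than through joins.
        return self._docs.get(key)


devices = DocumentCollection()
devices.put("dev-001", {"type": "thermostat", "temperature": 21.5})
devices.put("dev-002", {"type": "camera", "resolution": "1080p", "online": True})
```

The two sample documents have different fields, which is acceptable here because no relationship or shared schema ties them together.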
Document Database Service (DDS)
Flexible scaling
Visualized management
Application Scenarios of DDS
Single node
Cluster
Backups in OBS
GeminiDB
⚫ GeminiDB is a distributed, multi-model NoSQL database service with decoupled compute
and storage architecture. It is highly available, reliable, secure, and scalable and can provide
excellent performance and service capabilities like quick deployment, backup, restoration,
monitoring, and alarm reporting.
Advantages of GeminiDB
Compatibility with NoSQL APIs
⚫ It is compatible with Cassandra, MongoDB, Redis, and InfluxDB APIs.
Fast setup
⚫ You can create an instance on the console and use an ECS to access the instance over a private network to reduce response time and avoid the cost of using a public network.
Easy O&M
⚫ You can create or delete instances in just a few clicks on the console. Backing up and restoring data, configuring alarms, and adding nodes are just as easy.
Network security
⚫ A multi-layer security system, including VPCs, subnets, security groups, and anti-DDoS, is used to protect databases.
High Availability of GeminiDB
⚫ GeminiDB uses an architecture with decoupled compute and storage. Multiple DB instances
in a cluster share the underlying distributed storage. The faults that may occur include
compute node faults and storage node faults.
Online Scaling of GeminiDB
⚫ With cloud-native decoupled compute and storage, GeminiDB can use stateless compute
nodes, facilitating service scaling.
• Before service expansion
• When compute nodes are added, the DB instance is split into two nodes and virtual storage instances will be added.
• The two nodes share data files. No data needs to be copied.
• The new instance automatically constructs data with compaction.
• The old instance reclaims junk data with compaction or garbage collection (GC).
GeminiDB Mongo API
Flexible scaling
Visualized management
GeminiDB Mongo API: Cloud Architecture
providing a smooth
service experience.
GeminiDB Backups in
Mongo API OBS
Discussion
Three Challenges of Single-Node Database
Architecture
⚫ What is a single-node database architecture? The database is deployed only on one node. A single-node database is easy to use.
⚫ Challenges:
It is hard to scale storage.
◼ Storage capacity is subject to the space of the SSD or cloud disks used by your database.
◼ The disk IOPS is limited.
It is hard to scale compute resources.
◼ The database runs on one node, and the CPUs, memory, and networking capabilities are limited by the node configuration.
◼ Frequent operations on large tables cause the database performance to deteriorate sharply, affecting services.
It is less reliable.
◼ In the event of a fault, services can be completely interrupted.
◼ RDS primary/standby DB instances only resolve some of the issues.
A single-node database only supports small and medium business systems.
Scaling Up
[Figure: the business system stays the same while its database is moved to larger hosts: 4 vCPUs | 8 GB, then 64 vCPUs | 512 GB, then midrange computers.]
Scaling Out
Business
Business system
system
Advantages: You can use simple devices for unlimited scalability and higher availability.
Disadvantages: Coordinated technologies are required.
Question: If the database structure is complex and database/table sharding is required, how
can we simplify database use?
Scaling Out Using Read Replicas
RDS
(Standby)
App Server
Read replica
ELB
App Server
Read replica
RDS
(Primary)
Huawei Cloud Distributed Database Middleware
(DDM)
⚫ DDM is a MySQL-compatible database middleware service that can resolve distributed scaling issues.
⚫ DDM uses Huawei Cloud RDS as the storage engine and provides full-lifecycle O&M and control capabilities, such as automatic deployment, database and table sharding, auto scaling, and high availability.
DDM Architecture
Develop Application
Manage Deploy Access
services
Instance management
Schema management
Read Write Read Write Read Write
Backup and restoration
Console Database
DDM Scalability
[Figure: business systems connect through load balancers to multiple DDM SQL engines. Each engine contains an SQL parser, an SQL optimizer, an SQL executor, and distributed transactions. The engines route requests to shards db_00 through db_07, which are spread across RDS for MySQL instances.]
DDM nodes are stateless and can be scaled up or out.
RDS instances can be scaled up or out.
Vertical Sharding
Customer benefits: support for
sustainable growth of services
Order Member
Order details shard shard
Rapid growth of core workloads | Database sharding by domain | No impacts on business systems
Horizontal Sharding
Single-node
database DDM
Order details
Order shard 1 Order shard 3 Member shard 1
Vertical sharding prior to horizontal sharding | Sharding based on data volume | Impacts
greatly reduced upon a fault
Horizontal Sharding Example
Order details, sharded by user ID:
Order shard 0: Order table_0, Order table_1
Order shard 1: Order table_2, Order table_3
...
Order shard 4: Order table_8, Order table_9
System log data, sharded by time range:
Log shard 0: Log table_202001, Log table_202002, Log table_202003
DDM provides 14 sharding functions for you to choose from. Maintaining data aggregation is recommended.
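The layout on this slide (order tables 0 to 9 spread over order shards 0 to 4, and monthly log tables in one log shard) can be reproduced with two small routing functions. This is a sketch under assumptions: the modulo rule, the constants, and the function names `route_order` and `route_log` are hypothetical and are not one of the sharding functions that DDM provides.

```python
from typing import Tuple

SHARD_COUNT = 5        # Order shard 0 .. Order shard 4, as on the slide
TABLES_PER_SHARD = 2   # two order tables per shard, as on the slide


def route_order(user_id: int) -> Tuple[str, str]:
    """Map a user ID to (shard, table) with a simple modulo rule (assumption)."""
    table_index = user_id % (SHARD_COUNT * TABLES_PER_SHARD)
    shard_index = table_index // TABLES_PER_SHARD
    return f"Order shard {shard_index}", f"Order table_{table_index}"


def route_log(year: int, month: int) -> Tuple[str, str]:
    """Map a log entry's month to its time-range table in the single log shard."""
    return "Log shard 0", f"Log table_{year:04d}{month:02d}"
```

Because the user ID alone decides the table, all orders of one user land in the same table. This is one way to read the recommendation to keep related data aggregated, since queries for a single user then touch only one shard.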
DDM Read/Write Splitting
Read/Write (strong consistency)
Customer benefits: suitable
for read-intensive services Read (weak consistency)
DDM DDM
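Read/write splitting sends writes and strongly consistent reads to the primary instance, while reads that tolerate weak consistency can go to read replicas. The sketch below shows that routing decision in isolation. The `ReadWriteRouter` class, the `strong_consistency` flag, the instance names, and the "starts with SELECT" test are hypothetical simplifications and do not describe how DDM parses SQL.

```python
import itertools
from typing import Sequence


class ReadWriteRouter:
    """Pick a target instance for each statement (illustrative sketch only)."""

    def __init__(self, primary: str, replicas: Sequence[str]) -> None:
        self.primary = primary
        # Rotate through replicas so weak-consistency reads are spread evenly.
        self._replica_cycle = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str, strong_consistency: bool = False) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        if not is_read or strong_consistency or self._replica_cycle is None:
            return self.primary  # writes and strong reads go to the primary
        return next(self._replica_cycle)


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
```

A read issued right after a write may not see the new row on a replica because replication takes time, which is why the caller can request strong consistency for such reads.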
DDM Architecture Design
Which scenarios are more suitable for DDM?
[Figure: applications access DDM, which connects to a primary instance. The primary instance synchronizes data to a standby instance and to read replicas.]
Data Replication Service (DRS)
• Data Replication Service (DRS) enables you to migrate databases to a cloud with no downtime. It supports migration between
homogeneous databases, heterogeneous databases, distributed databases, and sharded databases. DRS enables data integration and
transmission from databases to databases, data warehouses, and big data within seconds, laying a solid foundation for enterprise
data integration and digital transformation.
Minimum permission design
⚫ Java Database Connectivity (JDBC) is used to connect to the source and destination databases, so you do not have to deploy programs on the databases themselves.
⚫ Tasks run on independent and exclusively-used VMs. The data of tenants is isolated.
⚫ The number of IP addresses is limited. Only the DRS instance IP address is allowed to access the source and destination databases.
Reliability design
⚫ Automatic reconnection: If the connection between DRS and your database breaks down due to a network disconnection or database switchover, DRS automatically retries the connection until the task is restored.
⚫ Resumable data transfer: If the connection between the source and the destination fails, DRS automatically marks the log file position for replay. After the fault is rectified, you can resume data transfer from where you left off and no data is lost.
⚫ If the VM where the DRS replication instance is located fails, workloads are automatically switched to a new VM with the IP address unchanged to ensure that the migration task is not interrupted.
[Figure: a user works through the DRS console and the DRS management and control platform. A VM (DRS instance) extracts data from the source DB over a JDBC connection and replays it to the destination DB over a JDBC connection.]
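The reliability design described on this slide combines two ideas: retry the connection after a failure, and remember the position already replicated so that transfer resumes from there. The sketch below shows that pattern in general form. The function `replicate_with_resume`, its callbacks, and the retry parameters are hypothetical; the sketch also assumes that applying a batch either fully succeeds or raises, and it does not represent DRS internals.

```python
import time
from typing import Callable, List


def replicate_with_resume(
    read_batch: Callable[[int], List],
    apply_batch: Callable[[List], None],
    start_position: int = 0,
    max_retries: int = 5,
    base_delay: float = 0.0,
) -> int:
    """Copy records from source to destination, resuming after connection errors.

    read_batch(position) returns the records starting at position (empty when done).
    Returns the final position reached.
    """
    position = start_position  # checkpoint: everything before it is already applied
    retries = 0
    while True:
        try:
            batch = read_batch(position)
            if not batch:
                return position
            apply_batch(batch)
            position += len(batch)  # advance the checkpoint only after success
            retries = 0
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
            # Exponential backoff before reconnecting (delay is configurable).
            time.sleep(base_delay * 2 ** (retries - 1))
```

Because the checkpoint moves only after a batch is applied, a dropped connection causes the same batch to be requested again instead of being skipped.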
DRS Functions: Real-Time Migration
⚫ For a real-time migration, DRS needs to be connected to both the source and destination DBs. The source DB,
destination DB, and migration objects must be configured, and then DRS can perform the migration automatically.
By comparing multiple items and data between source and destination, you can determine the best time for
migration to minimize downtime.
Advantages
DRS also supports incremental migration, which ensures service continuity while also minimizing downtime and migration impacts. In this way, databases can be smoothly migrated to the cloud. Also, all database
[Figure: DRS migrates all database objects from the source DB to the destination DB, which is empty before migration, over a public network, VPN, Direct Connect, or VPC.]
Real-Time Synchronization
⚫ Data synchronization refers to the real-time flow of key service data from a source to a destination while data consistency is ensured. Real-time synchronization is different from data migration. Migration means moving your whole database from one platform to another. Synchronization refers to the continuous flow of data between different services.
Typical scenarios: real-time analysis, report systems, and data warehouse environments
Advantages
DRS can meet various synchronization requirements, such as many-to-one and one-to-many synchronization, dynamic addition and deletion of synchronization tables, and synchronization between tables with different names.
Data is continuously synchronized between systems.
[Figure: databases of the financial system, HR system, and order system (DB A, DB B) are continuously synchronized with DB C.]
Database and Application Migration UGO (UGO)
⚫ Database and Application Migration UGO (UGO) is for heterogeneous database schema migration and
application SQL conversion. UGO provides database evaluation, object migration, and automatic syntax
conversion. These functions make database migration easier and less expensive.
Database migration tools and Huawei Cloud databases:
Heterogeneous database: UGO (schema migration) evaluates and converts heterogeneous database syntax, and DRS (data migration and verification) handles the data migration.
Homogeneous database: DRS (data migration and verification) handles the data migration.
Destination Huawei Cloud databases: Relational Database Service (RDS) and GaussDB.
Log-based data changes are captured in real time, and log-based incremental data verification runs in real time.
Data Differences
GeminiDB Redis API | RDS | OBS | GaussDB(DWS) | OBS Deep Archive
Hot data being processed
1. Context of the customers who are performing operations
2. Information about the most popular products
3. Latest patches
4. Data that has just been accessed
Data that will probably be accessed soon
1. Information about all normal customers
2. Information about products being sold
3. Historical patches
4. Data generated in the last month
Historical data
1. Information about inactive customers
2. Information about the products that are no longer available
3. Patches that have been eliminated
4. Data that is rarely accessed
Data Governance, Access Frequencies, and Lifecycle
GeminiDB Redis API | RDS | OBS | GaussDB(DWS) | OBS Deep Archive
Question
Application Scenarios of Caches
Data can be cached on mobile phones.
A cache is required for session management.
Content Delivery Network (CDN) Overview
CDN Working Principle
OBS
Customer CDN
mobile phone
Static Content and Dynamic Content
⚫ What information here is dynamic and what information is static?
CDN Content Sources
Customer CDN
mobile phone
html, csv ... TTL = ?
On-premises data center
ECS
Dynamic content
Application Scenarios of Caches
Application Scenarios of Caches
GeminiDB
Redis API
Why Do We Need a Cache Database?
Distributed Cache Service (DCS)
GeminiDB Redis API
It uses a cloud-native, distributed architecture and
architecture.
Architecture Comparison
Open-source Redis
• Open-source Redis data is scattered in the memory of independent nodes. If a pair of master and slave nodes become faulty, some data will be lost.
• Half of the nodes in a Redis cluster are slave nodes, and they are read-only.
• Data is replicated asynchronously between master and slave nodes, so if you access master and slave nodes at the same time, you may obtain inconsistent data.
• The efficiency of the gossip protocol decreases significantly as the scale of the Redis cluster increases.
• Scaling the capacity of a Redis instance means increasing its physical nodes. This has a great impact on services.
Huawei Cloud GeminiDB Redis API
• All data is flushed to disks. Three copies of data are stored in the distributed shared storage pool, ensuring zero data loss.
• All compute nodes are writable.
• Three copies of data deliver strong consistency, and there are no dirty reads even when multiple nodes are handling service requests at the same time.
• Large-scale clusters can be managed. If a cluster node becomes faulty, other nodes of the cluster can take over services in seconds. All requests to the cluster are dynamically distributed across all cluster nodes based on your configurations.
• Storage and compute resources are decoupled, so they can be scaled flexibly and without any service interruptions.
Write-Through
[Figure: Application, Redis cache, and RDS database]
1 Writes data to the cache and database.
2 Reads data only from the cache.
Advantages
• Data is written to the cache based on a simple logic.
• Data can be read quickly.
• Data can be kept consistent.
Disadvantages
• If the cache is damaged, rebuilding it is difficult.
• Every write involves a penalty.
• The cache overhead is high.
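The write-through flow on this slide can be sketched with two dictionaries standing in for the Redis cache and the RDS database. The `WriteThroughStore` class is hypothetical and only illustrates the pattern: every write goes to both stores, and reads are served from the cache alone.

```python
from typing import Any, Dict, Optional


class WriteThroughStore:
    """Write-through caching with in-memory stand-ins (illustration only)."""

    def __init__(self) -> None:
        self.cache: Dict[str, Any] = {}      # stands in for the Redis cache
        self.database: Dict[str, Any] = {}   # stands in for the RDS database

    def write(self, key: str, value: Any) -> None:
        # Step 1: every write updates the database and the cache together.
        self.database[key] = value
        self.cache[key] = value

    def read(self, key: str) -> Optional[Any]:
        # Step 2: reads are served only from the cache.
        return self.cache.get(key)
```

The sketch also makes the disadvantages visible: each write does two updates, the cache holds everything that was ever written, and reads fail for all keys if the cache content is lost.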
Lazy Loading
[Figure: Application, Redis cache, and RDS database]
1 Access the cache first, and data is found.
2 In the event of a cache miss, read data from the database
3 and then write it to the cache.
4 Data is directly written to the database.
Advantages
• There is no requirement on cache reliability.
• Only required data is cached.
Disadvantages
• A cache miss causes a performance penalty.
• Dirty data may exist.
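The lazy loading flow on this slide can be sketched in the same style, again with dictionaries standing in for the Redis cache and the RDS database. The `LazyLoadingStore` class and its `misses` counter are hypothetical and exist only to make the behavior observable.

```python
from typing import Any, Dict, Optional


class LazyLoadingStore:
    """Lazy loading (cache-aside) with in-memory stand-ins (illustration only)."""

    def __init__(self, database: Dict[str, Any]) -> None:
        self.database = database           # stands in for the RDS database
        self.cache: Dict[str, Any] = {}    # stands in for the Redis cache
        self.misses = 0

    def read(self, key: str) -> Optional[Any]:
        if key in self.cache:
            return self.cache[key]         # cache hit
        self.misses += 1                   # cache miss: pay the extra database read
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value        # populate the cache for later reads
        return value

    def write(self, key: str, value: Any) -> None:
        # Writes go directly to the database; the cached copy is left untouched.
        self.database[key] = value
```

After a write, the cached copy can differ from the database until it is refreshed, which is the dirty data listed as a disadvantage. If the cache is lost, reads still work because they fall back to the database.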
Discussion
Dealing With Dirty Data and Data Inconsistency
⚫ Don't worry!
⚫ Configure an appropriate time to live (TTL) to minimize the
possibility of dirty data.
⚫ For the service logic, a certain amount of latency is acceptable.
⚫ The application logic can tolerate dirty data. It does not cause
exceptions.
⚫ Even a one-second cache TTL can impact costs and performance.
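A time to live bounds how long a cached value can stay out of date: once the TTL has passed, the entry is dropped and the next read has to fetch fresh data. The sketch below shows this with an in-memory cache. The `TTLCache` class and its injectable `clock` parameter are hypothetical; a Redis-compatible cache provides key expiration itself, so this only illustrates the principle.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds (illustration only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be demonstrated without waiting
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (value, self._clock() + self.ttl)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: the caller must reload from the database
            return None
        return value
```

A shorter TTL reduces the window in which dirty data can be served but causes more cache misses, so the value is a trade-off between freshness and database load.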
Solution Design Principles: Data
Architecture Design
Storage Reliability
1. Choose a cluster or primary/standby instance as required.
• Note the differences between backup and replication. Sometimes, continuous replication can cause problems.
Data Reliability
AZ Region
AZ
Application Application
SQL engine
EVS
Database OBS
Data Performance
1. Monitoring is a key to tracking service performance.
3. When planning the network architecture, specify how hot and cold data should be handled.
• The performance requirements of cold data should be different from those of hot data.
Data Costs
1. Evaluate the unit price and performance of different engines.
2. Shutdown: Stop unnecessary read replicas to shrink the size of a database cluster.
3. Purchase: Choose pay-per-use or yearly billing mode appropriately.
4. High prices do not mean high costs: Consider how to make monthly reports.
5. Optimize the architecture.
• Add caches.
• Adjust the data design.
6. Monitoring: Improve resource utilization.
[Figure: cost factors. 1. Unit prices of database engines; 2. Database performance; 3. Stop unnecessary read replicas; 4. Shrink a database cluster; 5. Billing modes; 6. Monitoring]
Discussion: How to Design a Cloud Database for
Promotions
⚫ There are a huge number of concurrent requests during flash sales, so the database must be designed
with enough performance to handle such concurrent reads and writes while ensuring data consistency
and reliability.
⚫ Assume that you are an architect of an e-commerce platform and the following requirements need to
be met.
Customers can purchase products during flash sales, but each customer can purchase a given product only once, because the
inventory is limited.
The platform must support a large number of concurrent operations and ensure data consistency and reliability.
During flash sales, customer requests should be responded to in a timely manner and transactions need to be completed fast.
After the promotion ends, the platform should be able to produce data summaries and allow users to analyze, export, and
browse data.
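As one possible starting point for the discussion, the "one purchase per customer" and "limited inventory" requirements can both be enforced inside a single database transaction. The sketch below uses Python's built-in sqlite3 module as a stand-in for a relational database. The table names, column names, and the functions `create_shop` and `purchase` are hypothetical, and the sketch does not address caching, queuing, or sharding, which a real flash-sale design would also need.

```python
import sqlite3


def create_shop(stock: int) -> sqlite3.Connection:
    """Create an in-memory shop with one product (ID 1) and the given stock."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory (product_id INTEGER PRIMARY KEY, stock INTEGER NOT NULL)"
    )
    # The composite primary key lets the engine reject a second purchase
    # of the same product by the same customer.
    conn.execute(
        "CREATE TABLE orders (customer_id INTEGER, product_id INTEGER, "
        "PRIMARY KEY (customer_id, product_id))"
    )
    conn.execute("INSERT INTO inventory VALUES (1, ?)", (stock,))
    conn.commit()
    return conn


def purchase(conn: sqlite3.Connection, customer_id: int, product_id: int) -> str:
    """Place an order and decrement stock atomically; return a status string."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute("INSERT INTO orders VALUES (?, ?)", (customer_id, product_id))
            cur = conn.execute(
                "UPDATE inventory SET stock = stock - 1 "
                "WHERE product_id = ? AND stock > 0",
                (product_id,),
            )
            if cur.rowcount != 1:
                raise RuntimeError("sold out")  # triggers rollback of the order row
        return "ok"
    except sqlite3.IntegrityError:
        return "already purchased"
    except RuntimeError:
        return "sold out"
```

The conditional `UPDATE ... WHERE stock > 0` prevents overselling without a separate read, and the rollback ensures that a sold-out attempt leaves no order row behind.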
Quiz
1. (Single-choice question) When you design a database solution using Huawei
Cloud services, which of the following services is recommended for reducing
the impact of high-concurrency workloads on databases in a single-service
scenario?
A. GaussDB
B. DCS
C. RDS
D. EVS
Summary
⚫ This lesson described how to design and deploy a database on Huawei Cloud. It also covered five principles for reviewing database solutions. We hope these discussions have improved your understanding of database architecture design.
Acronyms and Abbreviations
⚫ AZ: availability zone
⚫ ECS: Elastic Cloud Server
⚫ EVS: Elastic Volume Service
⚫ OBS: Object Storage Service
⚫ SFS: Scalable File Service
⚫ EIP: Elastic IP
⚫ VPC: Virtual Private Cloud
Thank You.
Copyright © 2024 Huawei Technologies Co., Ltd. All Rights Reserved.