High Availability Oracle Database
PT. Massindo International
Project Background
Background Requirement
Massindo needs to build a High Availability system for its production
database to maintain data availability and prevent losses caused by
outages.
Key Objective Proposal
Build a High Availability architecture and solution for the production
database
Current Topology
[Figure: current topology, physical and logical views. Labels: LAN switch with 1G LAN connections; C7000; 8G SAN FC connections over FC cable; P2000 and Dell storage; 1/8 Autoloader LTO5; servers DB, Deploy, WEB, WEB2, WEB03, APP, Standby; DB servers.]
Current Condition
• No HA system is installed on the production DB server
• Data protection relies only on database backups taken with RMAN
Proposed Solution
• Build a High Availability architecture for the database server
• Provide preventive maintenance of HA quarterly
• Provide failover testing every 3 months
Build High Availability Architecture
• Main Objective:
- Deliver a detailed design and architecture of High Availability for the Oracle production database
- Implement High Availability on the production database server
• Pre-Implementation Tasks:
- Assess the current system
- Create a Bill of Materials
• Execution Method:
- Remote support handling
- On-site support when remote handling is not possible, and after office hours or on weekends
Model Architecture of HA on Oracle Database
Option 1: Oracle Clusterware (Cold Cluster Failover)
[Figure: Normal Condition and Failed Node Condition. Apps connect through Oracle Net Service client access to Node 1 (primary node, active) and Node 2 (failover node, standby). The nodes are linked by a heartbeat and the database sits on shared storage.]
Option 2: Oracle Data Guard
[Figure: Normal Condition. Oracle Broker spans Site A (Oracle instance, primary database) and Site B (Oracle instance, standby database). Failed Node Condition (Fast Start Failover): Site A is shown as the standby database and Site B as the primary database.]
Option 3: Oracle Clusterware and Data Guard
[Figure: Apps connect through a WAN traffic manager and Oracle Net Service client access to two clusters, each with Node 1 (primary node, active) and Node 2 (failover node, standby) linked by a heartbeat. Labels also show a dedicated network and Oracle Data Guard between the Production DB and the Standby DB.]
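The cold cluster failover behaviour of Option 1 can be illustrated with a small simulation. This is only a toy model, not how Oracle Clusterware is implemented: the class names, the `timeout` value, and the rule "fail over when the active node has been silent longer than the timeout and the standby is still alive" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float  # time (seconds) of the last heartbeat received


class ColdCluster:
    """Toy model of a two-node active/standby cluster on shared storage."""

    def __init__(self, primary: Node, standby: Node, timeout: float = 30.0):
        self.active = primary    # node currently running the database instance
        self.standby = standby   # idle node waiting to take over
        self.timeout = timeout   # hypothetical heartbeat timeout in seconds

    def heartbeat(self, node_name: str, now: float) -> None:
        """Record a heartbeat from the named node."""
        for node in (self.active, self.standby):
            if node.name == node_name:
                node.last_heartbeat = now

    def check(self, now: float) -> str:
        """Fail over if the active node missed its heartbeat window."""
        active_silent = now - self.active.last_heartbeat > self.timeout
        standby_alive = now - self.standby.last_heartbeat <= self.timeout
        if active_silent and standby_alive:
            # The instance is restarted on the other node against the
            # same shared storage, so the roles simply swap.
            self.active, self.standby = self.standby, self.active
        return self.active.name


cluster = ColdCluster(Node("node1", 0.0), Node("node2", 0.0))
cluster.heartbeat("node1", 10.0)
cluster.heartbeat("node2", 10.0)
print(cluster.check(20.0))  # node1 is still within its heartbeat window
cluster.heartbeat("node2", 50.0)
print(cluster.check(55.0))  # node1 silent for 45 s, so node2 takes over
```

Because the standby instance is not running until failover, the restart time on the second node is what puts the RTO of this option in the range of minutes.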
Architectures based on Business Requirements
This table summarizes the advantages of the different high availability architectures and provides guidelines for choosing the correct one.
Your business requirements plus the allotted budget determine the appropriate architecture. The key factors include:
• Recovery time objective (RTO) and recovery point objective (RPO) for unplanned outages and planned maintenance
• Management overhead (MO)
• Total cost of ownership (TCO) and return on investment (ROI)
Consider Using: Oracle Database with Oracle Clusterware (Cold Cluster Failover)
Business or Application Impact:
• Maximum RTO for instance or node failure is in minutes.
• MO is low.
• ROI is low.
• RPO is zero.
• Rolling upgrade and patch capabilities for Oracle Clusterware with zero database downtime.
Consider Using: Oracle Database with Oracle Data Guard
Business or Application Impact:
• Maximum RTO for instance or node failure is in seconds to minutes.
• Maximum RTO for data corruptions, database, or site failures is in seconds to minutes.
• Choice of RPO equal to zero (SYNC) or near-zero (ASYNC).
• MO is low.
• ROI is high.
• Rolling upgrade for system, clusterware, database, and operating system.
• Off-load read-only, reporting, testing, and backup activities to the standby database.
• Limited support for mixed platforms.
Consider Using: Oracle Database with Oracle Clusterware and Oracle Data Guard
Business or Application Impact:
• All of the business benefits of Oracle Clusterware (cold cluster failover) and Oracle Data Guard.
• MO is low.
• ROI is medium.
• RPO is zero for cluster failover; choice of RPO equal to zero (Data Guard SYNC) or near-zero (Data Guard ASYNC) for database failover.
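As a rough aid for reading the table above, the three options can be encoded as data and filtered by requirement. This is a sketch only: the field names and the `candidates` helper are hypothetical. The two boolean columns reflect one reading of the table, in which only the Data Guard options list site-failure recovery and standby off-load.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Architecture:
    name: str
    site_failure_protection: bool  # table lists recovery from site failures
    standby_offload: bool          # standby usable for reporting and backups
    roi: str                       # rating as listed in the table


# Values transcribed from the table; the boolean columns are an interpretation.
OPTIONS = [
    Architecture("Oracle Clusterware (Cold Cluster Failover)", False, False, "low"),
    Architecture("Oracle Data Guard", True, True, "high"),
    Architecture("Oracle Clusterware and Oracle Data Guard", True, True, "medium"),
]


def candidates(need_site_protection: bool, need_offload: bool) -> list[str]:
    """Return the names of the options that satisfy the stated requirements."""
    return [
        option.name
        for option in OPTIONS
        if (option.site_failure_protection or not need_site_protection)
        and (option.standby_offload or not need_offload)
    ]


# A requirement for site-failure protection rules out Option 1.
print(candidates(need_site_protection=True, need_offload=False))
```

Budget and management overhead still have to be weighed separately, since the table rates them only qualitatively.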
Benefits of Each HA Option
HA Option: Oracle Database with Oracle Clusterware (Cold Cluster Failover) (OPTION 1)
Business or Application Impact:
• Automatic recovery from node and instance failures in minutes
• Ability to customize the failure detection mechanism. For example, you can use your favorite application query in the database check action. With application-specific failure detection, Oracle Clusterware can fail over not only in the obvious cases, such as when the instance is down, but also when, for example, an application query is not meeting a particular service level.
• High availability functionality to manage third-party applications
• Rolling release upgrades of Oracle Clusterware
HA Option: Oracle Database with Oracle Data Guard (OPTION 2)
Business or Application Impact:
• Fast, automatic or automated database failover for data corruptions, lost writes, and database and site failures
• Automatic corruption repair replaces a corrupted block on the primary or physical standby by copying a good block from the physical standby or primary database
• Reduced downtime with Oracle Data Guard rolling upgrade capabilities
• Ability to off-load primary database activities, such as backups, queries, or reporting, without sacrificing the RTO and RPO, by using the standby database as a read-only resource with the real-time query apply lag capability
• No need for instance restart, storage remastering, or application reconnections after site failures
• Transparency to applications
• Transparent and integrated support for application failover
• Effective network utilization
HA Option: Oracle Database with Oracle Clusterware and Oracle Data Guard (OPTION 3)
Business or Application Impact:
• All of the benefits of Option 1 and Option 2
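The customizable failure detection listed for Option 1 can be sketched as a check routine that treats a slow application query as a failure. This is an illustration under assumptions, not Oracle Clusterware code: `check_action`, `run_query`, and `max_seconds` are hypothetical names, and the 0 / non-zero return convention is assumed here to mirror a typical action-script exit status.

```python
import time
from typing import Callable

ONLINE, OFFLINE = 0, 1  # assumed convention: 0 = healthy, non-zero = failed


def check_action(run_query: Callable[[], object], max_seconds: float) -> int:
    """Return ONLINE if the application query succeeds within max_seconds."""
    start = time.monotonic()
    try:
        run_query()  # caller supplies the application-specific query
    except Exception:
        return OFFLINE  # instance down or query error
    elapsed = time.monotonic() - start
    # A query that works but misses its service level also counts as a failure.
    return ONLINE if elapsed <= max_seconds else OFFLINE


# Usage with stand-in queries; a real check would run SQL through a database driver.
print(check_action(lambda: "ok", max_seconds=1.0))
print(check_action(lambda: time.sleep(0.05), max_seconds=0.01))
```

Passing the query in as a callable keeps the sketch independent of any particular database driver.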
Maintenance HA (Preventive)
• Main Objective:
- Prevent failures when performing a fast failover
- Analyze and check the health of the HA system
- Troubleshoot errors on the HA system during maintenance activity
• Activity Period:
- Quarterly
• Execution Method:
- Remote support handling (before going on site)
- On-site support when remote handling is not possible, and after office hours or on weekends
Testing Restore
• Main Objective:
- To prove the reliability and availability of data
- To prevent a failed failover when a disaster occurs
• Activity Period:
- Every 3 months
• Execution Method:
- Remote support handling (before going on site)
- On-site support when remote handling is not possible, and after office hours or on weekends
Scope Of Work
• Provide support to install and configure the HA system on the database server
• Provide support for Oracle Clusterware and Data Guard
• Provide preventive maintenance
• Provide failover testing