Zero Data Loss Recovery Appliance:
Best Practices from Customer
Deployments
[CON6535]
Marco Calmasini, Sr. Principal Product Manager, Oracle
Jony Safi, Consulting Member of Technical Staff, Oracle
Gagan Singh, Sr. DB Architect, Intel Corporation
Copyright © 2016, Oracle and/or its affiliates. All rights reserved. |
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle’s products remains at the sole discretion of Oracle.
Recovery Appliance Program Agenda
1 Business Values
2 New and Simpler World
3 Best Practices – Manageability
4 Best Practices – Configuration
5 Best Practices – Backup and Recovery
6 Best Practices – Validation, Security and Troubleshooting
7 The Intel experience – a real case scenario
Traditional backup solutions
• Data Loss Exposure: lose all data since last backup
• Daily Backup Window: large performance impact on production
• Poor Database Recoverability: many files are copied but protection state of database is unknown
• Many Systems to Manage: scale by deploying more backup appliances
Need a Fundamentally Different
Approach to Protect Business
Critical Database Data
Zero Data Loss
Recovery Appliance
Recovery Appliance Unique Benefits for Business and I.T.
• Eliminate Data Loss: real-time redo transport provides instant protection of ongoing transactions (Zero to Sub-1s RPO)
• Minimal Impact Backups: production databases only send changes; all backup and tape processing offloaded (Savings, Backup Time Shrinks)
• Database Level Recoverability: end-to-end reliability, visibility, and control of databases, not disjoint files (Recovery Readiness, Validations)
• Cloud-Scale Protection: easily protect all databases in the data center using massively scalable service (HA and Scalable Architecture)
Zero Data Loss Recovery Appliance Overview
Protected Databases: Delta Push
• DBs access and send only changes
• Minimal impact on production
• Real-time redo transport instantly protects ongoing transactions
Protects all DBs in Data Center
• Petabytes of data
• Oracle 10.2-12c, any platform
• No expensive DB backup agents
• Enterprise Manager end-to-end control
Recovery Appliance: Delta Store
• Backups validated, compression, deduplication
• Fast restores to any point-in-time using deltas
• Built on Exadata scaling, HA and resilience
Offloads Tape Backup
• Integrated Media Manager / Third Party Backup Client SW
Replicates to Remote Recovery Appliance for DR
Recovery Appliance Timeline
[Timeline figure. Labels: RA X4 HW; RA X5 HW; RA X6 HW; Recovery Appliance – Addressing Weaknesses of Backup Appliances; Upgraded HW – Better Performance; 8 TB drives – 2X Capacity – 20X more Effective Storage; File System Backup – Full Stack Protection]
Recovery Appliance – Data Protection Around the World
Hundreds of Systems Shipped in 2 Years Since Launch Across All Geographies
Manufacturing Education Utilities Finance Telco
Traditional Backup Architecture
Protected DB clients
• Resource consuming agents
• Expensive licenses
• Fragmented management
Media Servers
• Expensive licenses
• Manual load distribution
• Frequently overloaded
• Disparate infrastructure
• Copies to tape utilize overloaded media servers
Target Devices
• If disk array – no real scale-out
• Often lacking in HA
• Limited deduplication
Comparison: Introducing Recovery Appliance
Protected DB Clients
• No agents
• Resources fully dedicated to business
• No licenses
• EM manages full lifecycle
Media Servers
• Not needed
• HW can be repurposed
• Less network traffic
• Easier management
Target Devices
• Directly integrated
Pre-Built, Optimized and Highly Available Out-of-the-Box
[Figure: two panels plotting Performance and High Availability Achievement (up to 100%) against Time. One panel uses Time (Days); the Custom Configuration panel uses Time (Months) and is annotated "Assemble dozens of components", "Measure, diagnose and reconfigure", "Multi-vendor finger pointing". Additional label: "Recoverability Quality".]
Recovery Appliance Best Practices
Jony Safi
Best practices rule #1
• The Recovery Appliance is not an Exadata machine, it’s an APPLIANCE
• No changes needed for great performance, HA and resilience
• Follow the specific ZDLRA documentation and MOS notes
• Create EM alerts on new incidents, key metrics, etc.
• Information Center: Overview Zero Data Loss Recovery Appliance (Doc ID 1683791.2)
• Run Exachk monthly (Oracle Exadata Database Machine exachk or HealthCheck (Doc ID 1070954.1)).
• Subscribe to MOS alerts and refer periodically to the following notes:
– Zero Data Loss Recovery Appliance Critical Issues (Doc ID 1927928.1) for critical issues alerts
– Zero Data Loss Recovery Appliance Supported Versions (Doc ID 1927416.1) for latest software updates.
Backup Manageability Best Practices
• Configure protected database
– Use Enterprise Manager Cloud Control
• Simplest deployment and configuration for 11g and 12c
– Steps to back up database
• Create Protection Policy on Recovery Appliance (RA)
• Add Protected Database to RA
• Configure Backup Settings for Protected Database
• Schedule one-time Level 0 (Full) Backup with “Custom Backup”, then Level 1s with “Oracle-Suggested Recovery Appliance Backup”
What to do with existing backups?
• RMAN backups to disk or NFS share (including Data Domain share)
– Can be imported into the Recovery Appliance via “polling”
• Backups taken using 3rd party backup software
– Leave the agent in place on the protected DB hosts until retention expires.
– Removing agents saves system resources
• Interim period dual backup target MOS notes:
– Implementing a Dual Backup Strategy with Backups to Disk/Tape and Recovery Appliance (Doc ID 2154471.1 and Doc ID 2154461.1)
– Removing dual backups eliminates expensive storage and tape hardware, and reduces backup windows significantly
Network configuration options
• 10GigE and IB
– ZDLRA supports 10GigE (default) and InfiniBand. The default 10GigE provides 12 TB/hour backup and restore rates.
– Owner’s Guide, Chapter 9 has details on how to configure ingest over IB.
– Note: Real-Time Redo Transport will use the 10GigE network ONLY.
• VLAN for security
– Backup and restore traffic from different VLANs is not routed.
– The RA supports VLAN tagging on the ingest network so protected DB hosts residing on different and isolated VLANs can be connected directly to the RA.
– See MOS note: Enabling 8021.Q VLAN Tagging in Zero Data Loss Recovery Appliance Over Ingest Networks (Doc ID 2047411.1)
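As a rough sizing aid, the quoted 12 TB/hour rate can be turned into a best-case transfer time. The sketch below is illustrative only: the function name and the 30 TB example database are assumptions, not from the slides, and real throughput depends on the network path and the protected host.

```python
def transfer_hours(data_tb, rate_tb_per_hour=12.0):
    """Best-case hours to move data_tb at the given aggregate rate.

    The 12 TB/hour default is the 10GigE figure quoted on the slide;
    everything else here is a hypothetical illustration.
    """
    if data_tb < 0 or rate_tb_per_hour <= 0:
        raise ValueError("size must be >= 0 and rate must be > 0")
    return data_tb / rate_tb_per_hour


# Hypothetical 30 TB database restored at the quoted rate.
print(transfer_hours(30))  # 2.5
```

Treat the result as a lower bound on elapsed time when comparing against an RTO target.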
Customizations
• Do not make any changes to Recovery Appliance servers (again, IT’S AN APPLIANCE and it’s already optimized).
– See MOS note: Consequences of modifying the Recovery Appliance (Doc ID 2172842.1) - for restrictions and supported configuration exceptions.
• Use MAX_RETENTION_WINDOW to enforce hard limits on data retention for all databases within a protection policy.
– Backups are forcibly removed after exceeding window
– Principal use case is for compliance / regulatory requirements
– Use Recovery Window Goals to manage backup space consumption.
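The difference between a recovery window goal and a hard maximum retention window can be pictured with a small classifier. This is a simplified sketch under stated assumptions: the function name, the state labels, and the 14/30-day values are hypothetical, and the appliance's actual purge logic is not described in the slides.

```python
def retention_state(age_days, recovery_window_goal_days, max_retention_window_days):
    """Classify a backup by age against a policy's two windows.

    Simplified model (assumption): backups older than the maximum
    retention window are forcibly removed; backups older than the goal
    but inside the maximum are no longer covered by the goal.
    """
    if max_retention_window_days < recovery_window_goal_days:
        raise ValueError("max retention window must be >= recovery window goal")
    if age_days > max_retention_window_days:
        return "purged"
    if age_days > recovery_window_goal_days:
        return "beyond goal"
    return "within goal"


# Hypothetical policy: 14-day recovery window goal, 30-day hard limit.
for age in (3, 20, 45):
    print(age, retention_state(age, 14, 30))
```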
Backup best practices
• Create Incremental L0 (full) as first backup
• Create subsequent Cumulative L1 backups “Incremental Forever”
• Use Section Size for Large Datafiles (e.g. 1+ TB)
– Use 64GB as starting point and evaluate up to (aggregate data files size / # channels)
• Virtual Full Backup Creation Monitoring
– After an L1 incremental backup the RA indexes it and builds the corresponding Virtual Full. Check EM for error messages like “ORA-64760: Database XYZ has had tasks in ordering wait state for over X days.”
• Refer to MOS note: Diagnostic SQL script for tasks in ORDERING_WAIT status on Recovery Appliance (Doc ID 2095949.1)
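The section size guidance above amounts to a simple range: start at 64 GB and evaluate values up to the aggregate data file size divided by the number of channels. A minimal sketch of that arithmetic follows; the function name and the 4096 GB / 8 channel example are assumptions chosen for illustration.

```python
START_GB = 64  # starting point suggested on the slide


def section_size_upper_bound_gb(aggregate_datafile_gb, channels):
    """Upper end of the SECTION SIZE range to evaluate, in GB."""
    if channels < 1:
        raise ValueError("channels must be >= 1")
    return aggregate_datafile_gb / channels


# Hypothetical database: 4096 GB of data files backed up over 8 channels.
upper = section_size_upper_bound_gb(4096, 8)
print(f"evaluate SECTION SIZE from {START_GB}g up to {upper:.0f}g")
```

Where the computed upper bound falls below 64 GB, the slide's guidance does not say which value wins, so that case is left to testing.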
Backup best practices – Cont.
• Use Transparent Data Encryption (TDE) instead of RMAN encryption
• Use native database compression instead of RMAN compression
• Use block change tracking for all protected databases
– $ rman target <target string> catalog <catalog string>
  backup device type sbt
  cumulative incremental level 1
  filesperset 1 section size 64g
  database
  plus archivelog not backed up
  filesperset 32;
Restore and Recovery Best Practices
What are your RTO and RPO requirements?
• When there is no validated disaster recovery plan and no Recovery Appliance, bad things happen.
• True Story of a Restore Operation gone terribly wrong.
– Database Failure Occurs – must be restored from backup
• Backup was not available on disk
• Backup restored from tape
• Found some tapes had been expired by mistake – took days to re-scan and re-catalog the pieces
• Tape library had issues – moved tapes to another library that had only 1GigE connectivity
• Tape restores were failing after many hours – [Link] parameter was wrong
– RTO = 8 days and RPO > 8 hours!
Restore and Recovery Best Practices – Cont.
• Use RMAN Restore Database / Recover Database as you would today
– No new RMAN commands to learn. Intelligent built-in recovery catalog in RA.
– RMAN is aware of the validated backups on disk, tape or replica. Restore operation is transparent and simple.
– Restore directly from tape or RA Replica without staging data on local RA if local disk backups are not present
• Performance considerations
– Maximize # of RMAN channels for Restore unless there are other active databases on the target.
– Restore operations are always prioritized automatically within RA without preventing other backup operations
• Bigfile Tablespace Practices and Considerations (recall backup best practice using SECTION SIZE)
– Oracle 11g databases can restore initial L0 with SECTION SIZE to parallelize sections across channels
• Restoring virtual fulls (created from L1s) does not parallelize sections
– Oracle 12c databases can restore L0 and virtual fulls with SECTION SIZE parallelism across channels
Recovery Appliance and Data Guard
• Follow all MAA recommendations
– One Recovery Appliance (RA) per data center
– Back up primary and standby databases to the local RA
– No RA replication needed for any database with a remote standby
– Restore operation can use any RA in any location
Recovery Appliance and Data Guard
MAA White Paper
• Post Data Guard role transition
– No change in backup operations. Continue to back up both the primary and standby databases to the local RA
• Deploying the Zero Data Loss Recovery Appliance in a Data Guard Configuration
– Refer to http://[Link]/technetwork/database/availability/recovery-appliance-data-[Link] or Deploying Zero Data Loss Recovery Appliance in a Data Guard Configuration
Validation, Security and Troubleshooting
Top problems faced in the field
• RTO or RPO SLAs not met
– Bad Tapes
– Corruptions in backups
– Missing pieces (archive logs, data files or control files)
– No automation or end to end understanding of restore and recover process
• Problem Avoidance:
– Weekly RMAN crosschecks
– Weekly or monthly RMAN backup or restore validate
– Monthly or Quarterly end to end restore and recovery validation testing and automation
Validation, Security and Troubleshooting
How Recovery Appliance addresses the issues
• Ingesting Backups
– Validate data blocks as they are read from source database and sent to appliance
• Indexing Backups
– Blocks received are validated and compressed as they are written to the delta store
• Ongoing Validation
– All backupsets are crosschecked daily
– All data file blocks are optimized weekly (meaning all blocks are read weekly)
– All backupsets are validated (think restore validate) bi-weekly
– May be modified by setting of configuration parameter
Validation, Security and Troubleshooting
How Recovery Appliance addresses the issues (continued)
• Built on Exadata
– Benefits from Exadata ASM checks and auto-repair from mirrored copy
• All checks run on the Recovery Appliance, offloading additional load from the protected databases
• End-to-end recovery plan still needs to be tested
– Does not remove the need for periodic full restore and recovery testing to prepare operations team and validate issues outside RA
• Monitor RA for any alerts or any database not meeting recovery window SLAs and address them early on.
Validation, Security and Troubleshooting
Customers requiring end to end security
• Client to Recovery Appliance, or Recovery Appliance to Client
– Best Encryption at Rest – Database/Backups with TDE
– Security in Flight - https, sqlnet encryption & Wallets/Certificates integration (stay tuned for upcoming MAA paper)
• Security in the Recovery Appliance
– Recovery Appliance administrator responsibilities
• Create Virtual Private Catalog (VPC) User
• Assign protected databases to a specific VPC User
• The protected database administrator can see all databases that share a common VPC user
Validation, Security and Troubleshooting
Troubleshooting Note
• For troubleshooting Recovery Appliance issues refer to
– ZDLRA Detailed Troubleshooting Methodology (Doc ID 2066528.1)
• For network performance related issues between protected databases and the Recovery Appliance refer to
– Recovery Appliance Network Test Throughput script (Doc ID 2022086.1)
Zero Data Loss Recovery Appliance: Best Practices from Customer Deployments
Gagan Singh
Intel Corporation
Agenda
• Legacy Backup Environment Overview
• Recovery Appliance – Implementation & learnings
• Recovery Appliance Value Summary
Legacy Backup Environment Overview
• Backups to tier 1 SAN storage
• Different backup strategies (incremental merge, backupsets)
• Backup retentions and validation managed manually
• I/O throttling and staggered backups
• Multiple vendor driver software and constraints
RA – Implementation & Learnings
• Execute test cases in pre-prod – Get Familiar!!
• Backups, Restores, Recovery (PiT/Full), ZDL, different OS
• Stay current on EM plug-ins
• Check N/W bandwidth from protected database to ZDLRA
• DB backup capacity planning – Rate of change
• Daily Incr + archivelog volume for the database(s)
• Leverage different protection policies → “Recovery Window” is KEY
• Leverage RMAN RESTORE PREVIEW for initial validations
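The capacity planning point (rate of change, daily incremental plus archivelog volume) can be sketched as a first-order estimate. The formula below is an assumption made for illustration, not a sizing method from the slides: it ignores compression, deduplication, and metadata overhead, and all names and example figures are hypothetical.

```python
def estimate_storage_tb(full_tb, daily_incr_tb, daily_archivelog_tb,
                        recovery_window_days):
    """First-order space estimate for one protected database.

    Assumption: one full backup plus the daily change volume
    (incremental + archivelog) retained for the recovery window.
    """
    if recovery_window_days < 0:
        raise ValueError("recovery window must be >= 0 days")
    daily_change_tb = daily_incr_tb + daily_archivelog_tb
    return full_tb + daily_change_tb * recovery_window_days


# Hypothetical database: 10 TB full, 0.5 TB/day incremental,
# 0.25 TB/day archivelogs, 14-day recovery window.
print(estimate_storage_tb(10, 0.5, 0.25, 14))  # 20.5
```

Summing this estimate across databases in a protection policy gives a starting figure to compare against measured usage during pre-production testing.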
RA – Implementation & Learnings (contd.)
• Monitor → Zero Data Loss Recovery Appliance Critical Issues (Doc ID 1927928.1)
• Periodic Platinum Patching – Downtime required
• Enable BCT
• Troubleshooting?
• Follow the relevant MOS note – ZDLRA detailed troubleshooting methodology (Doc ID 2066528.1)
• ra_incident_log
• Enable EM Notifications and alerts
Recovery Appliance Value Summary
Goals and Results
• Reduce resources on production databases
– Offloaded backups and validation to appliance, no more incremental merge
• Uniform Backup Environment
– Single backup strategy and type of backup
• Flexible Backup Retention
– Leveraged Protection Policies
– Disk Retention | Recovery Windows
• Reduce operational overhead
– Enterprise Manager Monitoring, BI Reports, notifications
• Reliability, Availability, & Performance
– Leverages Exadata-based HW
– Scalable
– Flexible to multiple database OS platforms
• Backup - Better RPO and RTO
– Virtual full restore eliminates incremental restore+apply time
• Reduce vendor footprint
– Ease of patching
– Single point of escalation
Summary / Q&A
Summary
• It’s all about simplification
• Simpler environment = better stability
• Follow the best practices
– It’s an Appliance!
– Follow MOS notes
– Patch regularly
• Zero Data Loss Recovery Appliance: Eliminate Data Protection Uncertainties [CON7405]
Donna Cooksey, Sales Enablement Lead, Oracle
• Thursday, Sep 22, 10:45 a.m. - 11:30 a.m. | Park Central - Metropolitan III
• Accelerating Database Backup and Recovery with Zero Data Loss Recovery Appliance
[CON1324] – Javier Ruiz, Kevin Prendergast, George Mamvura, Energy Transfer
• Thursday, Sep 22, 12:00 p.m. - 12:45 p.m. | Park Central - Metropolitan III
Meet us in the Showcase area outside the Keynote Hall!