Database Recovery
Techniques
Presented by Siddhartha Shrestha
Database Recovery Fundamentals
Recovery restores the database to its most recent, correct state after failure. This requires tracking all data changes in the
system log.
Two primary strategies address different failure types:
Catastrophic Failure Non-Catastrophic Failure
Extensive damage, like a disk crash, requires restoring from Less severe issues, like a transaction crash, focus on
a past backup copy. correcting inconsistencies from failed transactions.
Data Recovery: Caching & Logging
Data Caching (Buffering) 2. Write-Ahead Logging (WAL)
To speed things up, disk pages (blocks) containing This is a vital safety protocol designed to prevent data
data to be updated are first copied into main loss if the computer's memory (RAM buffer) or the log
memory buffers itself is lost during a failure
The DBMS cache is a collection of these in-memory
buffers, managed by the database system WAL ensures that crucial recovery information is safely
written to disk before data changes are finalised
A dirty bit is attached to each buffer: it's 0 if the buffer
hasn't been changed, and 1 if it has been modified
Write-Ahead Logging: Rules & Efficiency
These rules dictate when a modified data page from the database's memory cache can be written back to the physical disk
• You cannot replace an old data value with a new one on disk until the undo-type logging record for that change has been
permanently saved on disk.
• Before a transaction is declared "committed" (its operations are permanent), both the redo and undo parts of its log
entries must be permanently written to disk.
• To make recovery faster, the DBMS keeps a list of transaction details, including active (uncommitted) transactions and all
committed/aborted transactions up to the last checkpoint.
Control Over Writing Data to Disk (Steal/No-Steal,
Force/No-Force)
These rules dictate when a modified data page from the database's memory cache can be written back to the physical disk
1 No-steal approach: 2 Steal approach:
A buffer page modified by a transaction cannot be The recovery system allows writing an updated buffer
written to disk before that transaction officially to disk even before the transaction commits
commits
3 Force approach: 4 No-force approach:
All pages updated by a transaction must be The recovery system allows a transaction to commit
immediately written to disk before the even if its updated buffers have not yet been
transaction commits written to disk
Checkpoints – Creating Recovery "Bookmarks"
• Checkpoints are special entries in the system log.
• They act like a "bookmark" or "safe point", declaring that the DBMS was in a consistent state and all transactions before
this point were committed.(Think of it like saving your progress in a video game.)
• Checkpoints help reduce the amount of log data needed during recovery by ensuring older logs are safely stored and
cleared from memory.
Taking a Checkpoint Involves:
1.Temporarily stopping transaction execution
1.Forcing all modified memory buffers to be written to disk
1.Writing a [Checkpoint] record to the log and forcing the log to be written to disk
1.Resuming normal transaction execution
• Fuzzy Checkpointing is a technique used to avoid delays by allowing transactions to continue while buffers are
gradually written to disk, improving performance.
Undoing Transactions and Cascading Problems
Transaction Rollback:
• If a transaction fails before it's permanently saved (committed), any changes it made must be reversed.
• This means restoring the data on disk to its exact state before the transaction began (this is called the Before Image).
• Special undo log entries are used to make these restorations happen.
Cascading Rollbacks:
• When one transaction (like T) needs to be undone, it can sometimes force other transactions (like S) that used data written
by T to also be undone.
• This can create a chain reaction, where undoing one transaction leads to many others being undone in sequence.
• This problem occurs in recovery systems that allow "recoverable schedules" (meaning committed transactions only read
committed data) but do not prevent these "cascading" undoes with "strict" or "cascadeless" schedules.
NO-UNDO/REDO Recovery (Deferred Update Strategy)
This recovery strategy, known as the Deferred Update Strategy, simplifies transaction rollback by delaying the writing of data changes to disk until
after a transaction successfully commits. All modifications are held in memory until commitment. Only once the transaction commits and its log record is
saved are the updated data pages applied to the database. As a result, the recovery log primarily contains REDO operations using After Image (AFIM)
entries, and UNDO entries are not required because the original disk data remains unchanged until the commit, eliminating the need to undo partial
changes.
Key Benefits
No Undo Needed on Transaction Failure Prevention of Cascading Rollbacks
If a transaction fails before committing, **no undo operation is This approach ensures transactions **never read data from
needed**. Since changes are only written to disk upon successful uncommitted transactions**. Only committed data makes it to the
commit, the database on disk remains in its original, consistent stable database, preventing one transaction from depending on
state. This simplifies recovery and speeds up rollback. another's unfinished work. This completely eliminates cascading
rollbacks, enhancing system stability.
Potential Drawbacks
Limited Concurrency Increased Buffer Space Requirement
A drawback is that it can limit how many transactions run at once. This strategy may require more memory (buffer space) to hold all
Data items modified by a transaction must stay write-locked until updated data items until transactions commit. Since changes aren't
that transaction commits. This strict locking can reduce parallel immediately written to disk, modified pages accumulate in memory.
execution, potentially impacting system throughput. This can lead to memory pressure for systems with many or large
transactions.
Recovery Using Immediate Update (Part 1: UNDO/NO-
REDO)
With Immediate Update, a transaction's modifications are written directly to the database on disk as they happen, without
waiting for a successful commit. This "early write" approach necessitates specific recovery strategies.
Immediate Persistence Undo Operations Critical No Redo Needed
All transaction updates are recorded Since changes are on disk early, a Once committed, transaction
to disk immediately, even before mechanism is essential to undo changes are guaranteed to be on
the transaction has committed. effects of any transaction that fails disk, removing the need for redo
before committing. operations during recovery.
Recovery Using Immediate Update (Part 2: UNDO/REDO
Scenarios)
UNDO/NO-REDO recovery (continued)
• Uses a steal/force strategy:
⚬ Can write updated buffers to disk before commit (steal).
⚬ Must write all updates to disk before commit (force).
• Guarantees all updates are on disk at commit, so no REDO needed.
UNDO/REDO recovery (most general case)
• Uses a steal/no-force strategy:
⚬ Transactions can commit before all changes reach disk.
• If a transaction fails, UNDO partial changes.
• If a transaction commits but changes are not on disk, REDO those changes during recovery.
Shadow Paging – A "Copy-on-Write" Approach
• Shadow Paging is a recovery method that considers the database to be made up of a number of fixed-size disk blocks.
• It uses two directories:◦ A current directory points to the most recent or current database pages on disk.◦ A shadow
directory is an unmodified copy of the current directory at the start of a transaction, representing the "safe" state. The
shadow directory is saved on disk while the current directory is used by the transaction.
• When a transaction modifies a page, a new copy of the modified database page is created in a new, unused disk block. The
old copy of that page is not overwritten.
• This means two versions of updated pages exist: the old one (referenced by the shadow directory) and the new one
(referenced by the current directory).
• If the transaction fails, the current directory is simply discarded, and the shadow directory remains the valid, consistent
database state. If it commits, the current directory becomes the new shadow directory.
• Shadow paging avoids the need for undo/redo logs as the system can simply discard the current directory to revert to a
consistent state if something goes wrong.
Example of shadow paging:
Database Backup & Recovery from Major Disasters
• The recovery manager of a DBMS is equipped to handle severe, catastrophic failures like disk crashes.
• The primary method for this is database backups. This involves regularly copying the entire database and its log onto
affordable, large-capacity storage (like magnetic tapes or other offline devices).
• If a catastrophic failure occurs, the latest backup copy can be reloaded from the storage medium back onto the disks, and
the system can be restarted.
• For critical applications (e.g., banking, insurance), databases are often periodically backed up in their entirety and stored in
physically separate, secure locations.
• To minimise data loss between full database backups, it's customary to back up the system log more frequently by copying
it to magnetic tape.
Thank You! Any
Questions?