0% found this document useful (0 votes)

831 views12 pages

Log Based Recovery in DBMS

Q: How do deferred and immediate modification techniques impact transaction atomicity in DBMS?

Deferred modification techniques enhance transaction atomicity by ensuring that no database changes are made until a transaction is committed, thus guaranteeing that all operations are completed as a unit. In contrast, immediate modification techniques allow database changes during transaction execution, requiring more complex log records to roll back uncommitted changes if a transaction aborts. While immediate modifications increase concurrency, they also present a higher risk of violating atomicity if not properly managed .

Q: What is the main purpose of log-based recovery in database management systems?

The main purpose of log-based recovery is to maintain the atomicity and durability of transactions in a database management system. It uses a transaction log to record all modifications in the database, ensuring that changes made by completed transactions are not lost and those made by incomplete transactions can be undone in the event of a system failure. This ensures that after a failure, the database can be restored to a consistent state .

Q: Explain the significance of transaction identifiers in log records for a database recovery process.

Transaction identifiers in log records are crucial for tracking and managing the progress of individual transactions during the database recovery process. They allow the DBMS to link related log entries specifically to the correct transaction, facilitating precise undo and redo of operations. Identifiers ensure that modifications are correctly associated with their respective transactions, aiding in the efficient and error-free recovery of the database state post-failure .

Q: How does a DBMS use log-based recovery to handle system crashes and ensure data consistency?

Upon encountering a system crash, the DBMS uses log-based recovery by examining the transaction log to determine which transactions need to be undone or redone based on their commit statuses. Transactions marked with a start but lacking a commit or abort must be undone to remove any partial modifications, while those with a commit need their changes redone if not yet applied to the database. Checkpoints further streamline this process by restricting log examination to transactions active since the last checkpoint, thereby ensuring both efficiency and consistency .

Q: In what way do checkpoints function as synchronization points in DBMS?

Checkpoints serve as synchronization points by marking a consistent state in the database and ensuring that all changes up to that point are recorded in the database rather than just in memory. When a checkpoint is initiated, all logs are flushed to persistent storage, and the state of active transactions is captured. This process reduces the amount of work needed during recovery by ensuring that any changes made by committed transactions are not lost, essentially synchronizing the database and transaction log .

Q: Why are checkpoints crucial in enhancing the recovery process in DBMS?

Checkpoints are crucial in the recovery process as they reduce the time and resources needed to recover the system after a crash. By writing a checkpoint log, the DBMS can limit the range of logs it must check to determine redo and undo actions only to those transactions active since the last checkpoint. This minimizes unnecessary operations such as redoing changes already written to the database, making the recovery process more efficient .

Q: Discuss the advantages and limitations of using log-based recovery in databases.

Log-based recovery offers several advantages, including ensuring durability by providing a reliable data recovery method and facilitating faster recovery through replaying committed transactions. It also supports incremental backups, lowering corruption risks. However, it incurs additional overhead and complexity, requiring careful management to avoid data inconsistencies. The storage demands of maintaining comprehensive logs can also be significant, and replaying a large number of logs can be time-consuming .

Q: How does the concept of undo and redo operations work with transaction logging in a DBMS?

In transaction logging, every database modification is preceded by a log record, which contains both the old and new values of the modified data item. The undo operation uses log records to revert data items to their old values in case of aborted transactions, ensuring those changes are not persisted in the database. Conversely, the redo operation uses log records to apply new values to data items for committed transactions, ensuring those modifications are reflected in the database. These operations allow the system to maintain consistency and recover from failures .

Q: What are the types of modification techniques used in database modification, and how do they differ?

The two types of modification techniques used in databases are deferred and immediate modification techniques. In the deferred modification technique, a transaction does not modify the database until it has partially committed, thus changes are recorded in a log and applied only after commit. This ensures that the database is modified only if the transaction commits. In contrast, the immediate modification technique allows the database to be modified while the transaction is still active, with changes being made directly to the database before commit. This requires careful log management to undo changes from uncommitted transactions upon failure .

Q: What role do log records play in enabling the redo and undo operations during database recovery?

Log records are fundamental to enabling redo and undo operations during database recovery. Each log record captures detailed information about a transaction, including the data item, old value, and new value. This detailed account allows the DBMS to accurately revert to prior states or reapply committed changes. Undo operations use log records to restore data items to their pre-transaction state if the transaction is not successfully committed, whereas redo operations use them to ensure committed transaction changes are reflected in the database, thereby maintaining consistency .

Log-based recovery in DBMS ensures atomicity by maintaining logs of all transactions, allowing the system to undo or redo operations based on transaction states during a failure. The recovery process involves using log records to determine which transactions need to be undone or redone, with checkpoints implemented to streamline recovery time. This method provides durability and faster recovery, making it essential for maintaining database consistency after crashes.

Uploaded by

30686it

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

831 views12 pages

Log Based Recovery in DBMS

Uploaded by

30686it

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 12

‭Log based Recovery in DBMS‬

‭●‬
‭●‬
‭●‬
‭ he‬ ‭atomicity‬ ‭property‬ ‭of‬ ‭DBMS‬ ‭states‬ ‭that‬ ‭either‬ ‭all‬ ‭the‬ ‭operations‬ ‭of‬
T
‭transactions‬ ‭must‬ ‭be‬ ‭performed‬ ‭or‬ ‭none.‬ ‭The‬ ‭modifications‬ ‭done‬ ‭by‬ ‭an‬
‭aborted‬ ‭transaction‬ ‭should‬ ‭not‬ ‭be‬ ‭visible‬ ‭to‬ ‭the‬ ‭database‬ ‭and‬ ‭the‬
‭modifications‬‭done‬‭by‬‭the‬‭committed‬‭transaction‬‭should‬‭be‬‭visible.‬‭To‬‭achieve‬
‭our‬ ‭goal‬ ‭of‬ ‭atomicity,‬ ‭the‬ ‭user‬ ‭must‬ ‭first‬ ‭output‬ ‭stable‬ ‭storage‬ ‭information‬
‭describing‬ ‭the‬ ‭modifications,‬ ‭without‬ ‭modifying‬ ‭the‬ ‭database‬ ‭itself.‬ ‭This‬
‭information‬ ‭can‬‭help‬‭us‬‭ensure‬‭that‬‭all‬‭modifications‬‭performed‬‭by‬‭committed‬
‭transactions‬ ‭are‬ ‭reflected‬ ‭in‬ ‭the‬ ‭database.‬ ‭This‬ ‭information‬ ‭can‬ ‭also‬‭help‬‭us‬
‭ensure‬ ‭that‬ ‭no‬ ‭modifications‬ ‭made‬ ‭by‬ ‭an‬ ‭aborted‬ ‭transaction‬ ‭persist‬ ‭in‬ ‭the‬
‭database.‬

‭Log based Recovery in DBMS‬

‭Log and log records‬

‭ he‬‭log‬‭is‬‭a‬‭sequence‬‭of‬‭log‬‭records,‬‭recording‬‭all‬‭the‬‭updated‬‭activities‬‭in‬‭the‬
T
‭database.‬ ‭In‬ ‭stable‬ ‭storage,‬ ‭logs‬ ‭for‬ ‭each‬ ‭transaction‬ ‭are‬ ‭maintained.‬ ‭Any‬
‭operation‬‭which‬‭is‬‭performed‬‭on‬‭the‬‭database‬‭is‬‭recorded‬‭on‬‭the‬‭log.‬‭Prior‬‭to‬
‭performing‬‭any‬‭modification‬‭to‬‭the‬‭database,‬‭an‬‭updated‬‭log‬‭record‬‭is‬‭created‬
‭to‬ ‭reflect‬ ‭that‬ ‭modification.‬ ‭An‬ ‭update‬ ‭log‬ ‭record‬‭represented‬‭as:‬‭<Ti,‬‭Xj,‬‭V1,‬
‭V2> has these fields:‬
‭1.‬ ‭Transaction identifier:‬‭Unique Identifier of the transaction that‬
‭performed the write operation.‬
‭2.‬ ‭Data item:‬‭Unique identifier of the data item written.‬
‭3.‬ ‭Old value:‬‭Value of data item prior to write.‬
‭4.‬ ‭New value:‬‭Value of data item after write operation.‬
‭Other types of log records are:‬
‭1.‬ ‭<Ti start>‬‭: It contains information about when a transaction‬‭Ti starts.‬
‭2.‬ ‭<Ti commit>‬‭: It contains information about when a‬‭transaction Ti‬
‭commits.‬
‭3.‬ ‭<Ti abort>‬‭: It contains information about when a transaction‬‭Ti‬
‭aborts.‬
‭Undo and Redo Operations‬
‭ ecause‬‭all‬‭database‬‭modifications‬‭must‬‭be‬‭preceded‬‭by‬‭the‬‭creation‬‭of‬‭a‬‭log‬
B
‭record,‬‭the‬‭system‬‭has‬‭available‬‭both‬‭the‬‭old‬‭value‬‭prior‬‭to‬‭the‬‭modification‬‭of‬
‭the‬ ‭data‬ ‭item‬ ‭and‬ ‭new‬ ‭value‬ ‭that‬ ‭is‬ ‭to‬ ‭be‬ ‭written‬ ‭for‬ ‭data‬ ‭item.‬ ‭This‬ ‭allows‬
‭system to perform redo and undo operations as appropriate:‬
‭1.‬ ‭Undo:‬‭using a log record sets the data item specified‬‭in log record to‬
‭old value.‬
‭2.‬ ‭Redo:‬‭using a log record sets the data item specified‬‭in log record to‬
‭new value.‬
‭The database can be modified using two approaches –‬
‭1.‬ ‭Deferred Modification Technique:‬‭If the transaction‬‭does not‬
‭modify the database until it has partially committed, it is said to use‬
‭deferred modification technique.‬
‭2.‬ ‭Immediate Modification Technique:‬‭If database modification‬‭occur‬
‭while the transaction is still active, it is said to use immediate‬
‭modification technique.‬
‭Recovery using Log records‬
‭ fter‬ ‭a‬ ‭system‬ ‭crash‬ ‭has‬ ‭occurred,‬ ‭the‬‭system‬‭consults‬‭the‬‭log‬‭to‬‭determine‬
A
‭which transactions need to be redone and which need to be undone.‬
‭1.‬ ‭Transaction Ti needs to be undone if the log contains the record <Ti‬
‭start> but does not contain either the record <Ti commit> or the‬
‭record <Ti abort>.‬
‭2.‬ ‭Transaction Ti needs to be redone if log contains record <Ti start>‬
‭and either the record <Ti commit> or the record <Ti abort>.‬
‭Use of Checkpoints –‬‭When a system crash occurs, user‬‭must consult the‬
‭log. In principle, that need to search the entire log to determine this‬
‭information. There are two major difficulties with this approach:‬
‭1.‬ ‭The search process is time-consuming.‬
‭2.‬ ‭Most of the transactions that, according to our algorithm, need to be‬
‭redone have already written their updates into the database.‬
‭Although redoing them will cause no harm, it will cause recovery to‬
‭take longer.‬
‭To‬ ‭reduce‬ ‭these‬ ‭types‬ ‭of‬ ‭overhead,‬‭user‬‭introduce‬‭checkpoints.‬‭A‬‭log‬‭record‬
‭of‬‭the‬‭form‬‭<checkpoint‬‭L>‬‭is‬‭used‬‭to‬‭represent‬‭a‬‭checkpoint‬‭in‬‭log‬‭where‬‭L‬‭is‬
‭ ‬ ‭list‬ ‭of‬ ‭transactions‬ ‭active‬ ‭at‬ ‭the‬‭time‬‭of‬‭the‬‭checkpoint.‬‭When‬‭a‬‭checkpoint‬
a
‭log‬‭record‬‭is‬‭added‬‭to‬‭log‬‭all‬‭the‬‭transactions‬‭that‬‭have‬‭committed‬‭before‬‭this‬
‭checkpoint‬ ‭have‬ ‭<Ti‬ ‭commit>‬ ‭log‬ ‭record‬ ‭before‬ ‭the‬ ‭checkpoint‬ ‭record.‬ ‭Any‬
‭database‬‭modifications‬‭made‬‭by‬‭Ti‬‭is‬‭written‬‭to‬‭the‬‭database‬‭either‬‭prior‬‭to‬‭the‬
‭checkpoint‬ ‭or‬ ‭as‬ ‭part‬ ‭of‬‭the‬‭checkpoint‬‭itself.‬‭Thus,‬‭at‬‭recovery‬‭time,‬‭there‬‭is‬
‭no‬‭need‬‭to‬‭perform‬‭a‬‭redo‬‭operation‬‭on‬‭Ti.‬‭After‬‭a‬‭system‬‭crash‬‭has‬‭occurred,‬
‭the‬ ‭system‬‭examines‬‭the‬‭log‬‭to‬‭find‬‭the‬‭last‬‭<checkpoint‬‭L>‬‭record.‬‭The‬‭redo‬
‭or‬ ‭undo‬ ‭operations‬ ‭need‬ ‭to‬ ‭be‬ ‭applied‬ ‭only‬ ‭to‬ ‭transactions‬ ‭in‬ ‭L,‬ ‭and‬ ‭to‬ ‭all‬
‭transactions‬ ‭that‬ ‭started‬‭execution‬‭after‬‭the‬‭record‬‭was‬‭written‬‭to‬‭the‬‭log.‬‭Let‬
‭us‬ ‭denote‬ ‭this‬ ‭set‬ ‭of‬ ‭transactions‬ ‭as‬ ‭T.‬ ‭Same‬ ‭rules‬ ‭of‬ ‭undo‬ ‭and‬ ‭redo‬ ‭are‬
‭applicable‬ ‭on‬ ‭T‬ ‭as‬ ‭mentioned‬ ‭in‬ ‭Recovery‬ ‭using‬ ‭Log‬ ‭records‬‭part.‬‭Note‬‭that‬
‭user‬‭need‬‭to‬‭only‬‭examine‬‭the‬‭part‬‭of‬‭the‬‭log‬‭starting‬‭with‬‭the‬‭last‬‭checkpoint‬
‭log‬‭record‬‭to‬‭find‬‭the‬‭set‬‭of‬‭transactions‬‭T,‬‭and‬‭to‬‭find‬‭out‬‭whether‬‭a‬‭commit‬‭or‬
‭abort‬‭record‬‭occurs‬‭in‬‭the‬‭log‬‭for‬‭each‬‭transaction‬‭in‬‭T.‬‭For‬‭example,‬‭consider‬
‭the‬ ‭set‬ ‭of‬ ‭transactions‬ ‭{T0,‬ ‭T1,‬ ‭.‬ ‭.‬ ‭.,‬ ‭T100}.‬ ‭Suppose‬ ‭that‬ ‭the‬ ‭most‬ ‭recent‬
‭checkpoint‬ ‭took‬ ‭place‬‭during‬‭the‬‭execution‬‭of‬‭transaction‬‭T67‬‭and‬‭T69,‬‭while‬
‭T68‬ ‭and‬ ‭all‬ ‭transactions‬ ‭with‬ ‭subscripts‬ ‭lower‬ ‭than‬ ‭67‬ ‭completed‬ ‭before‬ ‭the‬
‭checkpoint.‬ ‭Thus,‬ ‭only‬ ‭transactions‬ ‭T67,‬ ‭T69,‬ ‭.‬ ‭.‬ ‭.,‬ ‭T100‬ ‭need‬ ‭to‬ ‭be‬
‭considered‬‭during‬‭the‬‭recovery‬‭scheme.‬‭Each‬‭of‬‭them‬‭needs‬‭to‬‭be‬‭redone‬‭if‬‭it‬
‭has‬ ‭completed‬ ‭(that‬ ‭is,‬ ‭either‬ ‭committed‬ ‭or‬ ‭aborted);‬ ‭otherwise,‬ ‭it‬ ‭was‬
‭incomplete, and needs to be undone.‬
‭Log-based‬ ‭recovery‬ ‭is‬ ‭a‬ ‭technique‬ ‭used‬ ‭in‬‭database‬ ‭management‬ ‭systems‬
‭(DBMS) ‭t‬o‬‭recover‬‭a‬‭database‬‭to‬‭a‬‭consistent‬‭state‬‭in‬‭the‬‭event‬‭of‬‭a‬‭failure‬‭or‬
‭crash.‬ ‭It‬ ‭involves‬ ‭the‬ ‭use‬ ‭of‬ ‭transaction‬ ‭logs,‬ ‭which‬ ‭are‬ ‭records‬ ‭of‬ ‭all‬ ‭the‬
‭transactions performed on the database.‬
‭In‬ ‭log-based‬ ‭recovery,‬ ‭the‬ ‭DBMS‬ ‭uses‬ ‭the‬ ‭transaction‬ ‭log‬ ‭to‬ ‭reconstruct‬‭the‬
‭database‬‭to‬‭a‬‭consistent‬‭state.‬‭The‬‭transaction‬‭log‬‭contains‬‭records‬‭of‬‭all‬‭the‬
‭changes‬‭made‬‭to‬‭the‬‭database,‬‭including‬‭updates,‬‭inserts,‬‭and‬‭deletes.‬‭It‬‭also‬
‭records information about each transaction, such as its start and end times.‬
‭When‬‭a‬‭failure‬‭occurs,‬‭the‬‭DBMS‬‭uses‬‭the‬‭transaction‬‭log‬‭to‬‭determine‬‭which‬
‭transactions‬ ‭were‬ ‭incomplete‬ ‭at‬ ‭the‬ ‭time‬ ‭of‬ ‭the‬ ‭failure.‬ ‭It‬ ‭then‬ ‭performs‬ ‭a‬
‭series‬ ‭of‬ ‭operations‬ ‭to‬ ‭undo‬ ‭the‬ ‭incomplete‬ ‭transactions‬ ‭and‬ ‭redo‬ ‭the‬
‭completed ones. This process is called the redo/undo recovery algorithm.‬
‭The‬ ‭redo‬ ‭operation‬ ‭involves‬ ‭reapplying‬ ‭the‬ ‭changes‬ ‭made‬ ‭by‬ ‭completed‬
‭transactions‬‭that‬‭were‬‭not‬‭yet‬‭saved‬‭to‬‭the‬‭database‬‭at‬‭the‬‭time‬‭of‬‭the‬‭failure.‬
‭This ensures that all changes are applied to the database.‬
‭The‬ ‭undo‬ ‭operation‬ ‭involves‬ ‭undoing‬ ‭the‬ ‭changes‬ ‭made‬ ‭by‬ ‭incomplete‬
‭transactions‬ ‭that‬ ‭were‬ ‭saved‬ ‭to‬ ‭the‬ ‭database‬ ‭at‬ ‭the‬ ‭time‬ ‭of‬ ‭the‬ ‭failure.‬ ‭This‬
‭restores‬ ‭the‬ ‭database‬ ‭to‬ ‭a‬ ‭consistent‬ ‭state‬ ‭by‬ ‭reversing‬ ‭the‬ ‭effects‬ ‭of‬ ‭the‬
‭incomplete transactions.‬
‭Once‬ ‭the‬ ‭redo‬ ‭and‬ ‭undo‬ ‭operations‬ ‭are‬ ‭completed,‬ ‭the‬‭DBMS‬‭can‬‭bring‬‭the‬
‭database back online and resume normal operations.‬
‭Log-based‬‭recovery‬‭is‬‭an‬‭essential‬‭feature‬‭of‬‭modern‬‭DBMSs‬‭and‬‭provides‬‭a‬
‭reliable‬ ‭mechanism‬ ‭for‬ ‭recovering‬‭from‬‭failures‬‭and‬‭ensuring‬‭the‬‭consistency‬
‭of the database.‬
‭Advantages of Log based Recovery‬
‭●‬ ‭ urability: ‭I‬n the event of a breakdown, the log file offers a‬
D
‭dependable and long-lasting method of recovering data. It‬
‭guarantees that in the event of a system crash, no committed‬
‭transaction is lost.‬
‭●‬ ‭Faster Recovery: ‬‭Since log-based recovery recovers‬‭databases by‬
‭replaying committed transactions from the log file, it is typically faster‬
‭than alternative recovery methods.‬
‭●‬ ‭Incremental Backup:‬‭Backups can be made in increments‬‭using‬
‭log-based recovery. Just the changes made since the last backup‬
‭are kept in the log file, rather than creating a complete backup of the‬
‭database each time.‬
‭●‬ ‭Lowers the Risk of Data Corruption:‬‭By making sure‬‭that all‬
‭transactions are correctly committed or canceled before they are‬
‭written to the ‬‭database‬‭, log-based recovery lowers‬‭the risk of data‬
‭corruption.‬
‭Disadvantages of Log based Recovery‬
‭●‬ ‭ dditional overhead:‬‭Maintaining the log file incurs‬‭an additional‬
A
‭overhead on the database system, which can reduce the‬
‭performance of the system.‬
‭●‬ ‭Complexity: ‬‭Log-based recovery is a complex process‬‭that requires‬
‭careful management and administration. If not managed properly, it‬
‭can lead to data inconsistencies or loss.‬
‭●‬ ‭Storage space: ‭T ‬ he log file can consume a significant‬‭amount of‬
‭storage space, especially in a database with a large number of‬
‭transactions.‬
‭●‬ ‭Time-Consuming:‬‭The process of replaying the transactions‬‭from‬
‭the log file can be time-consuming, especially if there are a large‬
‭number of transactions to recover.‬
‭Checkpoints in DBMS‬
‭●‬
‭●‬

‭ he‬ ‭Checkpoint‬ ‭is‬ ‭used‬ ‭to‬ ‭declare‬ ‭a‬ ‭point‬ ‭before‬ ‭which‬ ‭the ‬‭DBMS‬‭was‬ ‭in‬ ‭a‬
T
‭consistent‬ ‭state,‬ ‭and‬ ‭all‬ ‭transactions‬ ‭were‬ ‭committed.‬ ‭During‬ ‭transaction‬
‭execution,‬ ‭such‬ ‭checkpoints‬ ‭are‬ ‭traced.‬ ‭After‬ ‭execution,‬ ‭transaction‬ ‭log‬ ‭files‬
‭will‬ ‭be‬ ‭created.‬ ‭Upon‬ ‭reaching‬ ‭the‬ ‭savepoint/checkpoint,‬ ‭the‬ ‭log‬ ‭file‬ ‭is‬
‭destroyed‬ ‭by‬ ‭saving‬ ‭its‬ ‭update‬ ‭to‬ ‭the‬ ‭database.‬ ‭Then‬ ‭a‬ ‭new‬ ‭log‬ ‭is‬ ‭created‬
‭with‬ ‭upcoming‬ ‭execution‬ ‭operations‬ ‭of‬ ‭the‬ ‭transaction‬ ‭and‬ ‭it‬ ‭will‬‭be‬‭updated‬
‭until the next checkpoint and the process continues.‬
‭Why do We Need Checkpoints?‬
‭ henever‬ ‭transaction‬ ‭logs‬ ‭are‬ ‭created‬ ‭in‬ ‭a‬ ‭real-time‬‭environment,‬‭it‬‭eats‬‭up‬
W
‭lots‬‭of‬‭storage‬‭space.‬‭Also‬‭keeping‬‭track‬‭of‬‭every‬‭update‬‭and‬‭its‬‭maintenance‬
‭may‬‭increase‬‭the‬‭physical‬‭space‬‭of‬‭the‬‭system.‬‭Eventually,‬‭the‬‭transaction‬‭log‬
‭file‬ ‭may‬ ‭not‬ ‭be‬ ‭handled‬ ‭as‬ ‭the‬ ‭size‬ ‭keeps‬ ‭growing.‬ ‭This‬ ‭can‬ ‭be‬ ‭addressed‬
‭with‬ ‭checkpoints.‬ ‭The‬ ‭methodology‬ ‭utilized‬ ‭for‬ ‭removing‬ ‭all‬ ‭previous‬
t‭ransaction‬ ‭logs‬ ‭and‬ ‭storing‬ ‭them‬ ‭in‬ ‭permanent‬ ‭storage‬ ‭is‬ ‭called‬ ‭a‬
‭Checkpoint. ‬

‭Steps to Use Checkpoints in the Database‬

‭ .‬ ‭Write the begin_checkpoint record into a log.‬
1
‭2.‬ ‭Collect checkpoint data in stable storage.‬
‭3.‬ ‭Write the end_checkpoint record into a log.‬
‭The‬ ‭behavior‬ ‭when‬ ‭the‬ ‭system‬ ‭crashes‬ ‭and‬ ‭recovers‬ ‭when‬ ‭concurrent‬
‭transactions are executed is shown below:‬

‭Understanding Checkpoints in multiple Transactions‬

‭Transactions and operations of the above diagram:‬

‭Transaction 1‬ ‭Transaction 2‬ ‭Transaction 3‬ ‭Transaction 4‬

‭(T1) ‬ ‭(T2) ‬ ‭(T3) ‬ ‭(T4) ‬

‭START‬ ‭‬ ‭‬ ‭‬

‭‬ ‭START‬ ‭‬ ‭‬

‭‬ ‭COMMIT‬ ‭‬ ‭‬

‭‬ ‭‬ ‭START‬ ‭‬

‭‬ ‭‬ ‭COMMIT‬ ‭‬

‭‬ ‭‬ ‭‬ ‭START‬
‭Transaction 1‬ ‭Transaction 2‬ ‭Transaction 3‬ ‭Transaction 4‬
‭(T1) ‬ ‭(T2) ‬ ‭(T3) ‬ ‭(T4) ‬

‭‬ ‭‬ ‭‬ ‭FAILURE‬

‭●‬ ‭The‬ ‭recovery‬ ‭system‬ ‭reads‬ ‭the‬ ‭logs‬ ‭backward‬ ‭from‬ ‭the‬ ‭end‬ ‭to‬ ‭the‬
l‭ast checkpoint i.e. from T4 to T1.‬
‭●‬ ‭It will keep track of two lists – Undo and Redo.‬
‭●‬ ‭Whenever‬ ‭there‬ ‭is‬ ‭a‬ ‭log‬ ‭with‬ ‭instructions‬ ‭<Tn,‬ ‭start>and‬ ‭<Tn,‬
‭commit>‬ ‭or‬ ‭only‬ ‭<Tn,‬ ‭commit>‬ ‭then‬ ‭it‬ ‭will‬ ‭put‬ ‭that‬ ‭transaction‬ ‭in‬
‭Redo‬ ‭List.‬ ‭T2‬ ‭and‬ ‭T3‬ ‭contain‬ ‭<Tn,‬ ‭Start>‬ ‭and‬ ‭<Tn,‬ ‭Commit>‬
‭whereas‬ ‭T1‬ ‭will‬ ‭have‬ ‭only‬‭<Tn,‬‭Commit>.‬‭Here,‬‭T1,‬‭T2,‬‭and‬‭T3‬‭are‬
‭in the redo list.‬
‭●‬ ‭Whenever‬ ‭a‬ ‭log‬ ‭record‬ ‭with‬ ‭no‬ ‭instruction‬ ‭of‬ ‭commit‬ ‭or‬ ‭abort‬ ‭is‬
‭found,‬‭that‬‭transaction‬‭is‬‭put‬‭to‬‭Undo‬‭List‬‭<Here,‬‭T4‬‭has‬‭<Tn,‬‭Start>‬
‭but‬‭no‬‭<Tn,‬‭commit>‬‭as‬‭it‬‭is‬‭an‬‭ongoing‬‭transaction.‬‭T4‬‭will‬‭be‬‭put‬‭on‬
‭the undo list.‬
‭All‬ ‭the‬ ‭transactions‬ ‭in‬ ‭the‬ ‭redo‬ ‭list‬ ‭are‬ ‭deleted‬ ‭with‬ ‭their‬ ‭previous‬ ‭logs‬ ‭and‬
‭then‬ ‭redone‬ ‭before‬ ‭saving‬ ‭their‬ ‭logs.‬ ‭All‬ ‭the‬ ‭transactions‬ ‭in‬‭the‬‭undo‬‭list‬‭are‬
‭undone and their logs are deleted. ‬

‭Types of Checkpoints‬
‭There are basically two main types of Checkpoints:‬
‭ .‬ ‭Automatic Checkpoint‬
1
‭2.‬ ‭Manual Checkpoint‬
‭1.‬‭Automatic‬‭Checkpoint:‬‭These‬‭checkpoints‬‭occur‬‭very‬‭frequently‬‭like‬‭every‬
‭hour‬‭or‬‭every‬‭day.‬‭These‬‭intervals‬‭are‬‭set‬‭by‬‭the ‭d ‬ atabase‬‭administrator.‬‭They‬
‭are‬ ‭generally‬ ‭used‬ ‭by‬ ‭heavy‬ ‭databases‬ ‭as‬ ‭they‬ ‭are‬ ‭frequently‬ ‭updated,‬ ‭and‬
‭we can recover the data easily in case of failure.‬
‭2.‬ ‭Manual‬ ‭Checkpoint:‬‭These‬ ‭are‬ ‭the‬ ‭checkpoints‬ ‭that‬ ‭are‬ ‭manually‬ ‭set‬ ‭by‬
‭the‬‭database‬‭administrator.‬‭Manual‬‭checkpoints‬‭are‬‭generally‬‭used‬‭for‬‭smaller‬
‭databases.‬ ‭They‬ ‭are‬ ‭updated‬ ‭very‬ ‭less‬ ‭frequently‬ ‭only‬ ‭when‬‭they‬‭are‬‭set‬‭by‬
‭the database administrator.‬
‭Relevance of Checkpoints‬
‭ ‬‭checkpoint‬‭is‬‭a‬‭feature‬‭that‬‭adds‬‭a‬‭value‬‭of‬‭C‬‭in ‬‭ACID‬‭-compliant‬‭to ‬‭RDBMS‬‭.‬
A
‭A‬ ‭checkpoint‬ ‭is‬ ‭used‬ ‭for‬ ‭recovery‬ ‭if‬ ‭there‬ ‭is‬ ‭an‬ ‭unexpected‬ ‭shutdown‬ ‭in‬ ‭the‬
‭database.‬ ‭Checkpoints‬ ‭work‬ ‭on‬ ‭some‬ ‭intervals‬ ‭and‬ ‭write‬ ‭all‬ ‭dirty‬ ‭pages‬
‭(modified‬‭pages)‬‭from‬‭logs‬‭relay‬‭to‬‭data‬‭file‬‭from‬‭i.e‬‭from‬‭a‬‭buffer‬‭to‬‭a‬‭physical‬
‭disk.‬‭It‬‭is‬‭also‬‭known‬‭as‬‭the‬‭hardening‬‭of‬‭dirty‬‭pages.‬‭It‬‭is‬‭a‬‭dedicated‬‭process‬
‭and‬ ‭runs‬ ‭automatically‬ ‭by ‬‭SQL‬‭Server‬ ‭at‬ ‭specific‬ ‭intervals.‬ ‭The‬
‭synchronization‬‭point‬‭between‬‭the‬‭database‬‭and‬‭transaction‬‭log‬‭is‬‭served‬‭with‬
‭a checkpoint. ‬
‭Advantages of Checkpoints‬
‭●‬ ‭Checkpoints‬‭help‬‭us‬‭in‬‭recovering‬‭the‬‭transaction‬‭of‬‭the‬‭database‬‭in‬
c‭ ase of a random shutdown of the database.‬
‭●‬ ‭It‬ ‭enhancing‬ ‭the‬ ‭consistency‬ ‭of‬ ‭the‬ ‭database‬‭in‬‭case‬‭when‬‭multiple‬
‭transactions are executing in the database simultaneously.‬
‭‬
● ‭It increasing the data recovery process.‬
‭●‬ ‭Checkpoints‬ ‭work‬ ‭as‬ ‭a‬ ‭synchronization‬‭point‬‭between‬‭the‬‭database‬
‭and the transaction log file in the database.‬
‭●‬ ‭Checkpoint‬ ‭records‬ ‭in‬ ‭the‬ ‭log‬ ‭file‬ ‭are‬ ‭used‬ ‭to‬ ‭prevent‬ ‭unnecessary‬
‭redo operations.‬
‭●‬ ‭Since‬ ‭dirty‬ ‭pages‬ ‭are‬ ‭flushed‬ ‭out‬‭continuously‬‭in‬‭the‬‭background,‬‭it‬
‭has a very low overhead and can be done frequently.‬
‭●‬ ‭Checkpoints‬ ‭provide‬ ‭the‬ ‭baseline‬ ‭information‬ ‭needed‬ ‭for‬ ‭the‬
‭restoration of the lost state in the event of a system failure.‬
‭●‬ ‭A‬ ‭database‬ ‭checkpoint‬ ‭keeps‬ ‭track‬ ‭of‬ ‭change‬ ‭information‬ ‭and‬
‭enables incremental database backup. ‬
‭●‬ ‭A‬‭database‬‭storage‬‭checkpoint‬‭can‬‭be‬‭mounted,‬‭allowing‬‭regular‬‭file‬
‭system operations to be performed.‬
‭●‬ ‭Database‬ ‭checkpoints‬ ‭can‬ ‭be‬ ‭used‬ ‭for‬ ‭application‬ ‭solutions‬ ‭which‬
‭include backup, recovery or database modifications.‬
‭Disadvantages of Checkpoints‬
‭ . Database storage checkpoints can only be used to restore from logical‬
1
‭errors (E.g. a human error).‬
‭ . Because all the data blocks are on the same physical device, database‬
2
‭storage checkpoints cannot be used to restore files due to a media failure.‬

‭Real-Time Applications of Checkpoints‬

‭ .‬ ‭Backup and Recovery‬
1
‭2.‬ ‭Performance Optimization‬
‭3.‬ ‭Auditing‬

‭Deadlock in DBMS‬
‭●‬
‭●‬
‭●‬
I‭n‬ ‭a‬ ‭database‬ ‭management‬ ‭system‬ ‭(DBMS),‬ ‭a‬‭deadlock‬‭occurs‬‭when‬‭two‬‭or‬
‭more‬ ‭transactions‬ ‭are‬ ‭waiting‬ ‭for‬ ‭each‬ ‭other‬ ‭to‬ ‭release‬ ‭resources,‬ ‭such‬ ‭as‬
‭locks‬ ‭on‬ ‭database‬ ‭objects,‬ ‭that‬ ‭they‬ ‭need‬ ‭to‬‭complete‬‭their‬‭operations.‬‭As‬‭a‬
‭result,‬‭none‬‭of‬‭the‬‭transactions‬‭can‬‭proceed,‬‭leading‬‭to‬‭a‬‭situation‬‭where‬‭they‬
‭are stuck or “deadlocked.”‬
‭ eadlocks‬ ‭can‬ ‭happen‬ ‭in‬ ‭multi-user‬ ‭environments‬ ‭when‬ ‭two‬ ‭or‬ ‭more‬
D
‭transactions‬ ‭are‬ ‭running‬ ‭concurrently‬ ‭and‬ ‭try‬ ‭to‬ ‭access‬ ‭the‬ ‭same‬ ‭data‬ ‭in‬ ‭a‬
‭different‬ ‭order.‬ ‭When‬ ‭this‬ ‭happens,‬ ‭one‬ ‭transaction‬ ‭may‬ ‭hold‬ ‭a‬ ‭lock‬ ‭on‬ ‭a‬
‭resource‬ ‭that‬ ‭another‬ ‭transaction‬ ‭needs,‬ ‭while‬ ‭the‬ ‭second‬ ‭transaction‬ ‭may‬
‭ old‬ ‭a‬ ‭lock‬ ‭on‬ ‭a‬ ‭resource‬ ‭that‬ ‭the‬ ‭first‬ ‭transaction‬ ‭needs.‬ ‭Both‬ ‭transactions‬
h
‭are then blocked, waiting for the other to release the resource they need.‬
‭ BMSs‬ ‭often‬ ‭use‬ ‭various‬ ‭techniques‬ ‭to‬ ‭detect‬ ‭and‬ ‭resolve‬ ‭deadlocks‬
D
‭automatically.‬ ‭These‬ ‭techniques‬ ‭include‬ ‭timeout‬ ‭mechanisms,‬ ‭where‬ ‭a‬
‭transaction‬ ‭is‬ ‭forced‬ ‭to‬ ‭release‬ ‭its‬ ‭locks‬ ‭after‬ ‭a‬ ‭certain‬ ‭period‬ ‭of‬ ‭time,‬ ‭and‬
‭deadlock‬ ‭detection‬ ‭algorithms,‬ ‭which‬ ‭periodically‬ ‭scan‬‭the‬‭transaction‬‭log‬‭for‬
‭deadlock‬ ‭cycles‬ ‭and‬ ‭then‬ ‭choose‬ ‭a‬ ‭transaction‬ ‭to‬ ‭abort‬ ‭to‬ ‭resolve‬ ‭the‬
‭deadlock.‬
I‭t‬‭is‬‭also‬‭possible‬‭to‬‭prevent‬‭deadlocks‬‭by‬‭careful‬‭design‬‭of‬‭transactions,‬‭such‬
‭as‬ ‭always‬ ‭acquiring‬ ‭locks‬ ‭in‬ ‭the‬ ‭same‬ ‭order‬ ‭or‬ ‭releasing‬ ‭locks‬ ‭as‬ ‭soon‬ ‭as‬
‭possible.‬‭Proper‬‭design‬‭of‬‭the‬‭database‬‭schema‬‭and‬‭application‬‭can‬‭also‬‭help‬
‭to minimize the likelihood of deadlocks‬
I‭n‬ ‭a‬ ‭database,‬ ‭a‬ ‭deadlock‬ ‭is‬ ‭an‬ ‭unwanted‬ ‭situation‬ ‭in‬ ‭which‬ ‭two‬ ‭or‬ ‭more‬
‭transactions‬‭are‬‭waiting‬‭indefinitely‬‭for‬‭one‬‭another‬‭to‬‭give‬‭up‬‭locks.‬‭Deadlock‬
‭is‬ ‭said‬ ‭to‬ ‭be‬ ‭one‬ ‭of‬ ‭the‬ ‭most‬ ‭feared‬ ‭complications‬ ‭in‬ ‭DBMS‬ ‭as‬ ‭it‬ ‭brings‬ ‭the‬
‭whole system to a Halt. ‬
‭Example –‬‭let us understand the concept of Deadlock‬‭with an example : ‬
‭Suppose,‬ ‭Transaction‬ ‭T1‬ ‭holds‬ ‭a‬ ‭lock‬ ‭on‬ ‭some‬ ‭rows‬ ‭in‬ ‭the‬ ‭Students‬ ‭table‬
‭and ‭n ‬ eeds‬ ‭to‬ ‭update‬‭some‬ ‭rows‬ ‭in‬ ‭the‬ ‭Grades‬ ‭table.‬ ‭Simultaneously,‬
‭Transaction ‬‭T2‬‭holds‬‭locks‬‭on‬‭those‬‭very‬‭rows‬‭(Which‬‭T1‬‭needs‬‭to‬‭update)‬‭in‬
‭the‬ ‭Grades‬ ‭table ‭b ‬ ut‬ ‭needs‬‭to‬ ‭update‬ ‭the‬ ‭rows‬ ‭in‬ ‭the‬‭Student‬‭table ‬‭held‬‭by‬
‭Transaction T1‬‭. ‬
‭Now, the main problem arises. Transaction T1 will wait for transaction T2 to‬
‭give up the lock, and similarly, transaction T2 will wait for transaction T1 to‬
‭give up the lock. As a consequence, All activity comes to a halt and remains at‬
‭a standstill forever unless the DBMS detects the deadlock and aborts one of‬
‭the transactions. ‬

‭Deadlock in DBMS‬
‭ eadlock Avoidance: ‬‭When a database is stuck in a deadlock, It is always‬
D
‭better to avoid the deadlock rather than restarting or aborting the database.‬
‭The deadlock avoidance method is suitable for smaller databases whereas‬
‭the deadlock prevention method is suitable for larger databases. ‬
‭One method of avoiding deadlock is using application-consistent logic. In the‬
‭above-given example, Transactions that access Students and Grades should‬
‭always access the tables in the same order. In this way, in the scenario‬
‭described above, Transaction T1 simply waits for transaction T2 to release the‬
‭lock on Grades before it begins. When transaction T2 releases the lock,‬
‭Transaction T1 can proceed freely. ‬
‭Another method for avoiding deadlock is to apply both the row-level locking‬
‭mechanism and the READ COMMITTED isolation level. However, It does not‬
‭guarantee to remove deadlocks completely. ‬
‭ eadlock Detection: ‭W
D ‬ hen a transaction waits indefinitely‬‭to obtain a lock,‬
‭The database management system should detect whether the transaction is‬
‭involved in a deadlock or not. ‬
‭Wait-for-graph‬‭is one of the methods for detecting‬‭the deadlock situation.‬
‭This method is suitable for smaller databases. In this method, a graph is‬
‭drawn based on the transaction and its lock on the resource. If the graph‬
‭created has a closed loop or a cycle, then there is a deadlock. ‬
‭For the above-mentioned scenario, the Wait-For graph is drawn below:‬

‭ eadlock prevention: ‭F
D ‬ or a large database, the deadlock‬‭prevention method‬
‭is suitable. A deadlock can be prevented if the resources are allocated in such‬
‭a way that a deadlock never occurs. The DBMS analyzes the operations‬
‭whether they can create a deadlock situation or not, If they do, that transaction‬
‭is never allowed to be executed. ‬
‭Deadlock prevention mechanism proposes two schemes: ‬
‭●‬ ‭ ait-Die Scheme: ‬‭In this scheme, If a transaction‬‭requests a‬
W
‭resource that is locked by another transaction, then the DBMS‬
‭simply checks the timestamp of both transactions and allows the‬
‭older transaction to wait until the resource is available for execution. ‬
‭Suppose, there are two transactions T1 and T2, and Let the‬
‭timestamp of any transaction T be TS (T). Now, If there is a lock on‬
‭T2 by some other transaction and T1 is requesting resources held by‬
‭T2, then DBMS performs the following actions: ‬
‭Checks if TS (T1) < TS (T2) – if T1 is the older transaction and T2‬
‭has held some resource, then it allows T1 to wait until resource is‬
‭available for execution. That means if a younger transaction has‬
‭locked some resource and an older transaction is waiting for it, then‬
‭an older transaction is allowed to wait for it till it is available. If T1 is‬
‭an older transaction and has held some resource with it and if T2 is‬
‭waiting for it, then T2 is killed and restarted later with random delay‬
‭but with the same timestamp. i.e. if the older transaction has held‬
‭some resource and the younger transaction waits for the resource,‬
‭then the younger transaction is killed and restarted with a very‬
‭minute delay with the same timestamp. ‬
‭This scheme allows the older transaction to wait but kills the younger‬
‭one. ‬

‭●‬‭ ound Wait Scheme: ‬‭In this scheme, if an older transaction‬

W
‭requests for a resource held by a younger transaction, then an older‬
‭transaction forces a younger transaction to kill the transaction and‬
‭release the resource. The younger transaction is restarted with a‬
‭minute delay but with the same timestamp. If the younger transaction‬
‭is requesting a resource that is held by an older one, then the‬
‭younger transaction is asked to wait till the older one releases it. ‬
‭The following table lists the differences between Wait – Die and Wound -Wait‬
‭scheme prevention schemes:‬

‭Wait – Die ‬ ‭Wound -Wait‬

‭It is based on a non-preemptive technique.‬ ‭It is based on a preemptive technique.‬

‭In this, older transactions must wait for the‬ ‭In this, older transactions never wait‬
‭younger one to release its data items.‬ ‭for younger transactions. ‬

‭The number of aborts and rollbacks is higher‬ ‭In this, the number of aborts and‬
‭in these techniques.‬ ‭rollback is lesser.‬
‭Distributed Database System‬
‭●‬
‭●‬
‭●‬
‭ distributed database is basically a database that is not limited to one‬
A
‭system, it is spread over different sites, i.e, on multiple computers or over a‬
‭network of computers. A distributed database system is located on various‬
‭sites that don’t share physical components. This may be required when a‬
‭particular database needs to be accessed by various users globally. It needs‬
‭to be managed such that for the users it looks like one single database. ‬
‭ ypes: ‬
T
‭1. Homogeneous Database: ‬
‭In a homogeneous database, all different sites store database identically. The‬
‭operating system, database management system, and the data structures‬
‭used – all are the same at all sites. Hence, they’re easy to manage. ‬
‭2. Heterogeneous Database: ‬
‭In a heterogeneous distributed database, different sites can use different‬
‭schema and software that can lead to problems in query processing and‬
‭transactions. Also, a particular site might be completely unaware of the other‬
‭sites. Different computers may use a different operating system, different‬
‭database application. They may even use different data models for the‬
‭database. Hence, translations are required for different sites to communicate. ‬

‭ istributed Data Storage :‬

D
‭There are 2 ways in which data can be stored on different sites. These are: ‬
‭1. Replication –‬
‭In this approach, the entire relationship is stored redundantly at 2 or more‬
‭sites. If the entire database is available at all sites, it is a fully redundant‬
‭database. Hence, in replication, systems maintain copies of data. ‬
‭This is advantageous as it increases the availability of data at different sites.‬
‭Also, now query requests can be processed in parallel. ‬
‭However, it has certain disadvantages as well. Data needs to be constantly‬
‭updated. Any change made at one site needs to be recorded at every site that‬
‭relation is stored or else it may lead to inconsistency. This is a lot of overhead.‬
‭Also, concurrency control becomes way more complex as concurrent access‬
‭now needs to be checked over a number of sites. ‬
‭ . Fragmentation –‬
2
‭In this approach, the relations are fragmented (i.e., they’re divided into smaller‬
‭parts) and each of the fragments is stored in different sites where they’re‬
‭required. It must be made sure that the fragments are such that they can be‬
‭used to reconstruct the original relation (i.e, there isn’t any loss of data). ‬
‭Fragmentation is advantageous as it doesn’t create copies of data,‬
‭consistency is not a problem. ‬

‭Fragmentation of relations can be done in two ways: ‬

‭●‬ ‭ orizontal fragmentation – Splitting by rows –‬‭‬

H
‭The relation is fragmented into groups of tuples so that each tuple is‬
‭assigned to at least one fragment.‬
‭●‬ ‭Vertical fragmentation – Splitting by columns –‬‭‬
‭The schema of the relation is divided into smaller schemas. Each‬
‭fragment must contain a common candidate key so as to ensure a‬
‭lossless join.‬
‭In certain cases, an approach that is hybrid of fragmentation and replication is‬
‭used. ‬
‭Applications of Distributed Database:‬
‭●‬ ‭It is used in Corporate Management Information System.‬
‭●‬ ‭It is used in multimedia applications.‬
‭●‬ ‭Used in Military’s control system, Hotel chains etc.‬
‭●‬ ‭It is also used in manufacturing control system.‬

Common questions