Advanced DBMS

The document outlines various units related to relational database design and management, focusing on PostgreSQL and its procedural language PL/pgSQL. It covers essential concepts such as transaction management, concurrency control, crash recovery, database security, and system architectures. Each unit provides detailed insights into principles, methods, and mechanisms to ensure data integrity, consistency, and security in database systems.

Unit-I: Relational Database Design & PL/PostgreSQL

This unit focuses on the structured way data is organized and managed, particularly within the PostgreSQL database system using its procedural language.

- Relational Database Design: organizing data into tables (relations) with rows and columns, and establishing relationships between entities using keys (primary and foreign) to maintain data integrity and avoid redundancy. Key principles include clear table definitions, logical relationships, and well-defined purposes for tables (Source 1.1).
- PL/PostgreSQL (PL/pgSQL): PostgreSQL's native procedural language, which extends SQL with procedural constructs:
  - Language structure: defining blocks of code for functions, procedures, and triggers (Source 1.2, 1.3).
  - Control of program flow: conditional statements (IF, ELSIF, ELSE) and loops (LOOP, FOR, WHILE) allow complex logic and iteration within database operations (Source 1.3); see the function sketch after this unit.
  - Views: virtual tables based on the result set of a SQL query, used to simplify complex queries and restrict data access.
  - Functions and procedures: reusable code blocks encapsulated for specific tasks. Functions typically return a value, while procedures are designed for side effects (modifying data) and do not necessarily return a value.
  - Handling errors and exceptions: mechanisms (BEGIN ... EXCEPTION ... END) to gracefully manage runtime errors and preserve data integrity (Source 1.3).
  - Cursors: used to process a result set one row at a time, often within loops, for operations that cannot be achieved with a single SQL statement (Source 1.3); a cursor sketch follows this unit.
- Triggers: special stored procedures that execute automatically when a specific event (such as INSERT, UPDATE, or DELETE) occurs on a table (Source 1.2, 1.3); an example trigger appears below.
- Packages: more common in Oracle PL/SQL; PostgreSQL PL/pgSQL has no directly equivalent feature, and the related concepts are handled through schema organization of functions and procedures. In general, packages group related functions, procedures, variables, and other PL/SQL constructs.
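A minimal sketch of PL/pgSQL block structure, IF/ELSIF/ELSE control flow, and BEGIN ... EXCEPTION ... END error handling. The accounts table and the classify_balance function are hypothetical, invented here for illustration:

```sql
-- Hypothetical table reused by the sketches throughout this document.
CREATE TABLE accounts (
    id      serial PRIMARY KEY,
    owner   text NOT NULL,
    balance numeric NOT NULL CHECK (balance >= 0)
);

-- Block structure, conditionals, and exception handling in one function.
CREATE OR REPLACE FUNCTION classify_balance(acct_id integer)
RETURNS text
LANGUAGE plpgsql
AS $$
DECLARE
    amt numeric;
BEGIN
    -- INTO STRICT raises NO_DATA_FOUND when no row matches.
    SELECT balance INTO STRICT amt FROM accounts WHERE id = acct_id;

    IF amt >= 10000 THEN
        RETURN 'premium';
    ELSIF amt >= 1000 THEN
        RETURN 'standard';
    ELSE
        RETURN 'basic';
    END IF;
EXCEPTION
    WHEN NO_DATA_FOUND THEN
        RETURN 'unknown account';
END;
$$;
```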
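A hedged sketch of a trigger that audits changes to the hypothetical accounts table above (AFTER ... FOR EACH ROW with EXECUTE FUNCTION is PostgreSQL 11+ syntax; the audit table is invented for illustration):

```sql
-- Hypothetical audit table populated automatically by the trigger.
CREATE TABLE accounts_audit (
    acct_id    integer,
    operation  text,
    changed_at timestamptz DEFAULT now()
);

CREATE OR REPLACE FUNCTION log_account_change()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    -- TG_OP tells the trigger which event fired it.
    IF TG_OP = 'DELETE' THEN
        INSERT INTO accounts_audit (acct_id, operation) VALUES (OLD.id, TG_OP);
    ELSE
        INSERT INTO accounts_audit (acct_id, operation) VALUES (NEW.id, TG_OP);
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$;

CREATE TRIGGER accounts_audit_trg
AFTER INSERT OR UPDATE OR DELETE ON accounts
FOR EACH ROW EXECUTE FUNCTION log_account_change();
```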
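And a cursor sketch, processing the result set one row at a time as described above. The apply_interest function is hypothetical; in practice a single set-based UPDATE would usually be preferred, but the row-at-a-time pattern is the point here:

```sql
CREATE OR REPLACE FUNCTION apply_interest(rate numeric)
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
    -- A bound cursor; FOR UPDATE allows positioned updates below.
    cur CURSOR FOR SELECT id, balance FROM accounts FOR UPDATE;
    rec RECORD;
BEGIN
    OPEN cur;
    LOOP
        FETCH cur INTO rec;
        EXIT WHEN NOT FOUND;        -- stop when the result set is exhausted
        UPDATE accounts
        SET balance = rec.balance * (1 + rate)
        WHERE CURRENT OF cur;       -- update the row the cursor points at
    END LOOP;
    CLOSE cur;
END;
$$;
```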
Unit-II: Transaction Concepts

This unit covers the fundamental principles that ensure data integrity and reliability in a multi-user database environment.

- Transaction: a logical unit of work performed against a database, treated as a single, indivisible operation. It comprises one or more data manipulation statements (e.g., CREATE, READ, UPDATE, DELETE) and queries (Source 2.1, 2.2); a minimal sketch follows this unit.
- Properties of Transactions (ACID): crucial for maintaining data integrity:
  - Atomicity: all operations within a transaction either complete successfully or none of them do ("all or nothing") (Source 2.2, 2.3).
  - Consistency: a transaction brings the database from one valid, consistent state to another, ensuring all integrity rules are maintained (Source 2.2, 2.3).
  - Isolation: concurrent transactions execute independently without interfering with each other; the effects of an incomplete transaction are not visible to other transactions (Source 2.2, 2.3).
  - Durability: once a transaction is committed, its changes are permanently stored and survive system failures (Source 2.2, 2.3).
- States of a Transaction:
  - Active: the initial state, while the transaction is executing.
  - Partially committed: after the final operation has executed.
  - Committed: all operations succeeded and the changes are permanent.
  - Failed: an error occurred and the transaction cannot proceed.
  - Aborted: the transaction has been rolled back, undoing all its changes (Source 2.2).
- Concurrent Execution and Conflicting Operations: when multiple transactions run simultaneously, they may try to access or modify the same data, leading to conflicts such as lost updates, dirty reads, and non-repeatable reads; a non-repeatable-read sketch follows this unit.
- Schedules and Types of Schedules: a schedule is the chronological order of operations from interleaved transactions; different types (e.g., serial, non-serial) describe how the operations are interleaved.
- Serializability: the goal of concurrency control. A concurrent schedule is serializable if its effect is equivalent to some serial schedule (one in which transactions execute one after another without interleaving), ensuring correctness despite concurrency (Source 2.3, 3.2).
- Precedence Graph for Serializability: a directed graph used to test conflict serializability; if the graph contains no cycles, the schedule is serializable (Source 2.3).
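A minimal sketch of a transaction as an atomic unit of work, reusing the hypothetical accounts table from Unit I:

```sql
-- A transfer between two accounts either happens completely or not at all.
BEGIN;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- If either UPDATE fails (for example, the CHECK (balance >= 0)
-- constraint is violated), issue ROLLBACK instead to undo both steps.
COMMIT;
```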
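And a sketch of one of the conflicts named above, a non-repeatable read, using two concurrent sessions under READ COMMITTED (PostgreSQL's default isolation level). The balances shown are illustrative:

```sql
-- Session 1:
BEGIN;  -- READ COMMITTED by default
SELECT balance FROM accounts WHERE id = 1;   -- e.g. returns 500

-- Session 2, running concurrently (autocommits):
UPDATE accounts SET balance = 400 WHERE id = 1;

-- Session 1 again: re-running the same query now sees a different value,
-- because each READ COMMITTED statement sees the latest committed data.
SELECT balance FROM accounts WHERE id = 1;   -- now returns 400
COMMIT;
```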
Unit-III: Concurrency Control

This unit explores methods to manage simultaneous transactions so that data consistency and integrity are preserved.

- Ensuring Serializability by Locks: locks control access to data items, ensuring mutual exclusion.
- Lock Modes:
  - Shared lock (S-lock / read lock): allows multiple transactions to read the same data concurrently but prevents writes.
  - Exclusive lock (X-lock / write lock): allows only one transaction to read and write a data item, blocking all other access, read or write (Source 3.3, 3.4). A sketch of both modes follows this unit.
- Two-Phase Locking (2PL) and its Variations: a protocol that ensures serializability by operating in two phases:
  - Growing phase: a transaction may acquire new locks but cannot release any.
  - Shrinking phase: a transaction may release locks but cannot acquire new ones (Source 3.3, 3.4).
  Variations such as Strict 2PL provide stronger isolation by holding all exclusive locks until commit (Source 2.3).
- Multiple Granularity Locking Protocol: allows locking at different levels of granularity (entire database, table, page, record). This improves concurrency by permitting more specific locks, e.g., locking a single row instead of the whole table.
- Basic Timestamp Method: assigns a unique timestamp to each transaction; operations are processed in timestamp order, and operations that would violate that order are rejected (Source 3.2, 3.3).
- Thomas Write Rule: a modification of timestamp ordering that allows certain "obsolete" write operations to be ignored, potentially increasing concurrency.
- Dynamic Databases (the Phantom Problem): a transaction reads a set of rows, another transaction inserts new rows matching the first transaction's criteria, and a re-read then sees "phantom" rows. Stricter isolation levels (e.g., Serializable) or specific locking prevent this.
- Timestamps versus Locking: the two primary approaches to concurrency control, with differing trade-offs in implementation complexity, performance, and deadlock potential.
- Optimistic Concurrency Control: assumes conflicts are rare. Transactions execute without locks, validate their changes at commit time, and roll back if a conflict is detected. This can yield higher concurrency but risks aborting transactions (Source 3.1).
- Multi-Version Concurrency Control (MVCC): allows multiple versions of a data item to exist simultaneously. A reading transaction sees a consistent snapshot of the database, typically the version current when it started, so readers do not block writers and writers do not block readers; this avoids many read-write conflicts and deadlocks (Source 3.2, 3.4). PostgreSQL uses MVCC extensively; a snapshot sketch follows this unit.
- Deadlock Handling Methods:
  - Detection and recovery (wait-for graph): deadlocks are allowed to occur and are detected with a wait-for graph (nodes are transactions; an edge means "waits for"). When a cycle is found, one or more victim transactions are aborted and rolled back to break the deadlock (Source 3.4); see the inspection query after this unit.
  - Prevention algorithms (Wound-Wait, Wait-Die): lock-acquisition rules that make deadlocks impossible:
    - Wound-Wait: an older transaction "wounds" (forces the rollback of) a younger transaction that holds a resource it needs.
    - Wait-Die: an older transaction waits for a younger one; a younger transaction "dies" (rolls back) if it requests a resource held by an older transaction.
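PostgreSQL exposes shared and exclusive locking through row-level lock clauses. A sketch with two concurrent sessions on the hypothetical accounts table:

```sql
-- Session 1: acquire a row-level exclusive lock (conceptually an X-lock).
BEGIN;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
UPDATE accounts SET balance = balance + 10 WHERE id = 1;

-- Session 2, concurrently: request a shared row lock (conceptually an
-- S-lock). This blocks until session 1 commits or rolls back.
BEGIN;
SELECT * FROM accounts WHERE id = 1 FOR SHARE;

-- Coarser granularity exists too, e.g. a table-level lock:
-- LOCK TABLE accounts IN SHARE MODE;
```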
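A sketch of MVCC snapshots in PostgreSQL: under REPEATABLE READ, a transaction keeps seeing the snapshot taken at its first query, even while other sessions commit changes (balances are illustrative):

```sql
-- Session 1:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1;   -- e.g. returns 500

-- Session 2, concurrently (autocommits):
UPDATE accounts SET balance = 999 WHERE id = 1;

-- Session 1 again: the reader is not blocked by the writer, and still
-- sees its snapshot value rather than the newly committed 999.
SELECT balance FROM accounts WHERE id = 1;   -- still returns 500
COMMIT;
```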
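PostgreSQL's built-in deadlock handling follows the detection-and-recovery approach: after deadlock_timeout (default 1s) it examines the waits-for relationships and aborts one victim. Live blocking edges can be inspected with pg_blocking_pids (available since PostgreSQL 9.6):

```sql
-- Which sessions are currently blocked, and by whom?
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       state,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```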
Unit-IV: Crash Recovery

This unit focuses on mechanisms to restore the database to a consistent state after a system failure.

- Transaction Failure Classification: failures may stem from:
  - Transaction errors: logical errors within the transaction.
  - System errors: software faults or operating system failures.
  - Hardware failures: disk crashes, power failures (Source 4.1).
  - Catastrophic failures: loss of the entire storage.
- Recovery Concepts: the goal is to bring the database back to a consistent state after a crash by undoing incomplete transactions and redoing committed transactions whose changes may not have been fully written to disk (Source 4.1, 4.2, 4.3).
- Checkpoints: periodic operations in which the DBMS flushes all modified in-memory data pages to disk and records a special "checkpoint" entry in the log. This bounds the amount of log data that must be replayed during recovery (Source 4.3, 4.4); see the WAL/checkpoint sketch after this unit.
- Recovery with Concurrent Transactions (rollback, checkpoints, commit): recovery procedures must account for transactions that were active or partially committed at the time of the crash, using the log and checkpoints to decide what to roll back (undo uncommitted changes) and what to redo (ensure committed changes persist). A savepoint sketch below shows rollback in miniature.
- Log-Based Recovery Techniques (deferred and immediate update):
  - Log: the transaction log (redo log / write-ahead log, WAL) is a sequential record of all database modifications, containing enough information to undo or redo any transaction (Source 4.3, 4.4).
  - Deferred update (no-undo/redo): all changes are held in local buffers and written to the database only after the transaction commits. If a crash occurs before commit, no undo is needed; redo is required for committed transactions.
  - Immediate update (undo/redo): changes are written to the database disk as they occur during the transaction, so both undo (for aborted or uncommitted transactions) and redo (for committed transactions not fully written to disk) are required (Source 4.4).
- Buffer Management: the DBMS holds data pages in a buffer pool (memory cache); recovery interacts with how pages move between the buffers and disk.
- Shadow Paging: a recovery technique that maintains two page tables (current and shadow). Updates are applied to a new set of pages, and only after commit does the shadow page table become the current one, making the changes visible. This simplifies recovery, since no undo/redo is needed for committed transactions.
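PostgreSQL's write-ahead log is its concrete implementation of log-based recovery: every change is logged before the corresponding data page is written. A small sketch of the observable pieces (pg_current_wal_lsn requires PostgreSQL 10+, CHECKPOINT requires superuser rights):

```sql
-- Current end of the WAL stream, i.e. how far the log has advanced.
SELECT pg_current_wal_lsn();

-- Force a checkpoint: flush dirty buffers to disk and mark the log,
-- bounding how much WAL must be replayed after a crash.
CHECKPOINT;
```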
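Rollback at the SQL level works at a different granularity than crash recovery, but it is the same undo idea. A savepoint sketch on the hypothetical accounts table:

```sql
-- Undo part of a transaction without aborting the whole unit of work.
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 1;
SAVEPOINT transfer_step;
UPDATE accounts SET balance = balance + 50 WHERE id = 2;
ROLLBACK TO SAVEPOINT transfer_step;  -- undoes only the second update
COMMIT;                               -- the first update persists
```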
Unit-V: Database Security

This unit covers the measures and controls used to protect the database from unauthorized access, misuse, and data corruption.

- Database Security Concepts: preserving the Confidentiality, Integrity, and Availability of data (the CIA triad) (Source 5.1, 5.2). This involves protecting the data, the DBMS, associated applications, and the underlying infrastructure.
- Methods for Database Security:
  - Authentication: verifying the identity of users accessing the database (Source 5.2).
  - Authorization: granting specific privileges to authenticated users, determining what data they can access and what operations they can perform (Source 5.2).
  - Auditing: recording activities (logins, data access, modifications) to track usage and detect suspicious behavior (Source 5.1, 5.2).
  - Encryption: protecting confidentiality by transforming data into an unreadable form, both at rest (stored on disk) and in transit (during transmission) (Source 5.1, 5.2).
  - Backup security: ensuring backups are as well protected as the live database.
- Discretionary Access Control (DAC): the owner of a data object can grant and revoke access privileges to other users. It is "discretionary" because the owner decides who may access their objects; a GRANT/REVOKE sketch follows this unit.
- Mandatory Access Control (MAC) and Role-Based Access Control (RBAC) for Multilevel Security:
  - MAC: a more rigid model in which access decisions are based on security labels (sensitivity levels of data, clearance levels of users). Access is granted only if the user's clearance matches or exceeds the data's sensitivity; often used in highly secure environments.
  - RBAC: users are assigned roles, and permissions are granted to roles; users inherit the permissions of their roles. This simplifies security administration, especially in large organizations (Source 5.2); see the role sketch below.
  - Multilevel security: designing systems that handle data of different sensitivity levels and users of different clearance levels simultaneously, ensuring users access only the data for which they are cleared.
- Overview of Encryption Techniques for Security: symmetric and asymmetric encryption and hashing, and their application to protecting sensitive data in databases; a pgcrypto sketch follows this unit.
- Statistical Database Security: protecting individual privacy while still allowing statistical queries on the database, aiming to prevent sensitive individual data from being inferred from aggregate statistics.
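A DAC sketch in PostgreSQL: the table owner grants and revokes privileges at will. The role name is hypothetical:

```sql
-- The owner of accounts decides who may do what with it.
CREATE ROLE analyst LOGIN PASSWORD 'changeme';

GRANT SELECT ON accounts TO analyst;            -- allow reads
GRANT INSERT, UPDATE ON accounts TO analyst;    -- allow writes
REVOKE INSERT, UPDATE ON accounts FROM analyst; -- take the writes back
```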
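An RBAC sketch: permissions attach to roles, and users inherit them through role membership. Role and user names are hypothetical:

```sql
-- A permission-bearing role that nobody logs in as directly.
CREATE ROLE readonly NOLOGIN;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly;

-- A login user who inherits the role's permissions via membership.
CREATE ROLE alice LOGIN PASSWORD 'changeme';
GRANT readonly TO alice;
```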
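And an encryption sketch using the pgcrypto extension (shipped in PostgreSQL's contrib collection); the plaintext and key are illustrative placeholders:

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Symmetric encryption of a sensitive value:
SELECT pgp_sym_encrypt('123-45-6789', 'my-secret-key') AS ciphertext;

-- Decryption with the same key:
SELECT pgp_sym_decrypt(
         pgp_sym_encrypt('123-45-6789', 'my-secret-key'),
         'my-secret-key');

-- One-way hashing for stored passwords (bcrypt via crypt/gen_salt):
SELECT crypt('user-password', gen_salt('bf'));
```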
Unit-VI: Database System Architectures

This unit explores how database systems are structured and deployed, from single-server setups to complex distributed environments.

- Centralized and Client-Server Architectures:
  - Centralized architecture: all database components (data, DBMS, application logic, user interface) reside on a single machine; common for small-scale applications or development (Source 6.2, 6.3).
  - Client-server architecture (2-tier): the database resides on a server, and client applications on separate machines interact with it. The client handles the user interface and some application logic, while the server handles data management. This improves scalability and resource utilization (Source 6.1, 6.2).
  - 3-tier architecture: adds a middle (application) tier between the client and the database server. The client (presentation tier) talks to the application tier, which in turn communicates with the database tier, giving better scalability, security, and maintainability by decoupling concerns (Source 6.1, 6.2).
- Server System Architectures: the internal structure of the database server, including components such as the query processor, storage manager, and transaction manager (Source 6.3).
- Introduction to Parallel Systems: database systems designed to use multiple CPUs or I/O devices to execute operations concurrently, improving performance for large datasets and complex queries. They may be shared-memory, shared-disk, or shared-nothing architectures.
- Introduction to Distributed Systems: database systems in which data is stored across multiple interconnected computers (nodes) in a network, offering increased availability, scalability, and local autonomy. Data can be fragmented and replicated across nodes.
- Introduction to Object-Based Databases: database systems that model data as objects, similar to object-oriented programming, allowing complex data types, inheritance, and encapsulation beyond the traditional relational model.