Unit 6: File Indexing and Transaction Processing
16.6 Files of Unordered Records (Heap Files)
Overview
Heap File Definition:
The simplest type of file organization where records are placed in the
order they are inserted.
New records are added at the end of the file.
Also known as a heap or pile file.
Often used with additional access paths like secondary indexes for
efficient searching and retrieval.
Insertion of Records
Efficiency:
Very efficient due to the simplicity of appending records.
The last disk block of the file is copied into a buffer, the new record is
appended, and the block is rewritten back to the disk.
The address of the last file block is stored in the file header.
Searching for Records
Search Procedure:
Involves a linear search through the file block by block, making it an
expensive operation.
On average, if exactly one record satisfies the search condition, half of the
file blocks must be read and searched before it is found.
For a file with \( b \) blocks, this requires searching \( b/2 \) blocks on
average (a sketch of heap-file insertion and linear search appears after this list).
If no records or multiple records satisfy the search condition, all \( b \)
blocks must be searched.
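As a minimal sketch of these two operations (not code from the text), the following Python snippet appends records to the last block and performs a block-by-block linear search; the HeapFile class, the BLOCK_CAPACITY value, and the record layout are illustrative assumptions.

```python
# A minimal sketch of heap-file insertion and linear search.
# BLOCK_CAPACITY and the record layout are illustrative assumptions.

BLOCK_CAPACITY = 4  # records per block (the blocking factor)

class HeapFile:
    def __init__(self):
        self.blocks = [[]]  # each "block" is a list of records (dicts)

    def insert(self, record):
        """Append the record to the last block; allocate a new block if it is full."""
        last = self.blocks[-1]
        if len(last) == BLOCK_CAPACITY:
            self.blocks.append([])
            last = self.blocks[-1]
        last.append(record)

    def linear_search(self, field, value):
        """Scan block by block; on average half the blocks are read
        when exactly one record matches."""
        blocks_read = 0
        for block in self.blocks:
            blocks_read += 1
            for record in block:
                if record.get(field) == value:
                    return record, blocks_read
        return None, blocks_read

heap = HeapFile()
for i in range(10):
    heap.insert({"id": i, "name": f"rec{i}"})
print(heap.linear_search("id", 6))   # -> ({'id': 6, 'name': 'rec6'}, 2)
```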
Deleting Records
Procedure:
Locate the block containing the record, copy the block into a buffer,
delete the record from the buffer, and rewrite the block back to the disk.
This leaves unused space in the disk block, leading to wasted storage if
many records are deleted.
Deletion Marker:
An alternative method involves using an extra byte or bit (deletion
marker) stored with each record.
A specific value of the marker indicates a deleted record.
Search programs only consider records without the deletion marker.
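A standalone sketch of logical deletion with a deletion marker; the boolean deleted flag standing in for the deletion bit, and the list-of-blocks layout, are assumptions for illustration.

```python
# Sketch of logical deletion using a deletion marker (illustrative only).

def delete(blocks, field, value):
    """Mark matching records as deleted instead of physically removing them."""
    for block in blocks:
        for record in block:
            if record.get(field) == value:
                record["deleted"] = True

def search_live(blocks, field, value):
    """Search programs skip records whose deletion marker is set."""
    for block in blocks:
        for record in block:
            if not record.get("deleted", False) and record.get(field) == value:
                return record
    return None

blocks = [[{"id": 1}, {"id": 2}], [{"id": 3}]]
delete(blocks, "id", 2)
print(search_live(blocks, "id", 2))   # -> None (the record still occupies space)
```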
Reorganization
Purpose:
Periodic reorganization is needed to reclaim unused space from deleted
records.
During reorganization, file blocks are accessed sequentially, and
records are packed by removing deleted records.
Blocks are filled to capacity after reorganization.
Alternative Use of Deleted Space:
Space of deleted records can be reused for new records, requiring
additional bookkeeping to track empty locations.
Record Structure
Spanned vs. Unspanned Organization:
Unordered files can use spanned (records can span multiple blocks) or
unspanned (records are confined to a single block) organization.
Applicable to both fixed-length and variable-length records.
Modification of Records:
Modifying a variable-length record may necessitate deletion and
reinsertion if the modified record does not fit in the original space.
Reading Records in Sorted Order
Creating a Sorted Copy:
Reading records in order of some field values requires creating a sorted
copy of the file.
Sorting large disk files is expensive and requires special external
sorting techniques.
Direct Access
Unordered Fixed-Length Records:
Using unspanned blocks and contiguous allocation, direct access by
record position is straightforward.
If the records are numbered 0, 1, ..., r − 1 and the records in each block are
numbered 0, 1, ..., bfr − 1 (where bfr is the blocking factor), then the i-th
record of the file is located in block \( \lfloor i / bfr \rfloor \) and is the
\( (i \bmod bfr) \)-th record in that block.
Such files are often called relative or direct files due to direct access by
relative positions.
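A tiny worked example of the block-and-offset calculation (the blocking factor bfr = 4 and the record number are assumed values):

```python
# Relative (direct) access for unspanned, fixed-length records:
# record i lives in block floor(i / bfr) at offset i mod bfr.
bfr = 4                       # blocking factor (records per block) -- assumed
i = 6                         # record number to locate -- assumed
block_number = i // bfr       # -> 1
offset_in_block = i % bfr     # -> 2
print(block_number, offset_in_block)
```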
Access Paths:
Direct access by position facilitates constructing access paths such as
indexes, improving search and retrieval efficiency.
16.7 Files of Ordered Records (Sorted Files)
Overview
Ordered (Sequential) File Definition:
Records are physically ordered on disk based on the values of an
ordering field.
If the ordering field is a key field (unique value in each record), it is
called the ordering key.
Often referred to as a sequential file when ordered by an ordering
key.
Advantages
Efficient Reading:
Reading records in order of the ordering key is efficient because no
sorting is required.
Facilitates efficient sequential access and search conditions like key
= value or range conditions such as value1 < key < value2 .
Next Record Access:
Finding the next record in order usually requires no additional block
accesses unless the current record is the last in the block.
Binary Search:
Binary search can be used on blocks rather than records, improving
search efficiency.
For a file with \( b \) blocks, a binary search accesses roughly \( \lceil \log_2 b \rceil \)
blocks, which is better than the linear search requiring \( b/2 \) blocks on
average (see the sketch below).
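A minimal sketch of binary search over blocks rather than individual records, assuming an in-memory list of blocks sorted on a field named key (no overflow records):

```python
# Sketch of binary search over the *blocks* of an ordered file.
# sorted_blocks is a list of blocks, each a list of records sorted on "key".

def binary_search_blocks(sorted_blocks, key_value):
    lo, hi = 0, len(sorted_blocks) - 1
    while lo <= hi:
        mid = (lo + hi) // 2                 # one block access per iteration
        block = sorted_blocks[mid]
        if key_value < block[0]["key"]:      # smaller than the block's first key
            hi = mid - 1
        elif key_value > block[-1]["key"]:   # larger than the block's last key
            lo = mid + 1
        else:                                # if the key exists, it is in this block
            for record in block:
                if record["key"] == key_value:
                    return record
            return None
    return None

blocks = [[{"key": k} for k in range(start, start + 4)] for start in range(0, 40, 4)]
print(binary_search_blocks(blocks, 17))      # -> {'key': 17}
```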
Search Efficiency
Range Search:
Efficient due to the contiguous nature of records that satisfy the
search condition.
Random Access:
Provides no advantages for random or ordered access based on
nonordering fields.
Requires a linear search for random access.
Insertion and Deletion
Insertion:
Expensive as records must remain ordered.
Requires finding the correct position and shifting records, which is
time-consuming for large files.
On average, half the records must be moved, involving reading and
rewriting blocks.
Deletion:
Less severe if deletion markers and periodic reorganization are
used.
Efficiency Improvement:
Keep unused space in each block for new records.
Use an overflow or transaction file for new records, periodically
merging it with the main file to maintain order.
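A rough sketch of the overflow-file idea, with flat Python lists of (key, record) pairs standing in for the sorted main file and the unordered overflow file:

```python
# New records go to an unordered overflow file; periodic reorganization
# merges it back into the sorted main file. Structures are illustrative.

main_file = [(1, "a"), (3, "b"), (7, "c")]   # kept sorted on the ordering key
overflow = []

def insert(key, data):
    overflow.append((key, data))             # cheap: no shifting of the main file

def reorganize():
    """Merge the overflow file into the main file and re-sort (done periodically)."""
    global main_file, overflow
    main_file = sorted(main_file + overflow)
    overflow = []

insert(5, "d")
insert(2, "e")
reorganize()
print(main_file)   # [(1, 'a'), (2, 'e'), (3, 'b'), (5, 'd'), (7, 'c')]
```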
Record Modification
Nonordering Field:
Modified by changing and rewriting the record in its original
location.
Ordering Field:
Modification requires deletion of the old record and insertion of the
modified record due to position change.
Reading and Reorganization
Sequential Reading:
Efficient if ignoring overflow records.
Double buffering can be used to read blocks consecutively.
Including Overflow:
Requires merging overflow records into the correct positions.
Periodic reorganization sorts and merges the overflow file with the
master file, removing records marked for deletion.
Use Cases and Limitations
Primary Index:
Ordered files are often used with an additional access path called a
primary index, resulting in an indexed-sequential file.
Improves random access time on the ordering key field.
Clustered File:
If the ordering attribute is not a key, the file is called a clustered file.
Summary
Access Time:
Table 16.3 summarizes the average access time, in block accesses, to find a
specific record in a file with \( b \) blocks under each organization.
Use in Databases:
Ordered files are rarely used in databases unless combined with an additional
access path, such as a primary index.
Key Points
Ordered records provide efficient sequential and range-based access.
Binary search on blocks enhances search efficiency.
Insertions and deletions are costly operations due to the need to
maintain order.
Combining with primary indexes (indexed-sequential files) enhances
performance in database applications.
17.1 Types of Single-Level Ordered Indexes
Introduction to Ordered Indexes
Concept: Similar to a textbook index which lists important terms in
alphabetical order with page numbers for easy lookup.
Purpose: Allows efficient access to records using indexing fields
without scanning the entire file linearly.
Structure: Index stores values of the index field along with pointers to
disk blocks containing those values, allowing binary search on the
index.
Key Fields and Indexing
Indexing Field: Single field used to create an index, storing values in an
ordered manner for efficient searching.
Binary Search: Performed on the index, which is smaller than the data
file, enhancing search efficiency.
Multilevel Indexes: Extend the idea of binary search by partitioning the
search space n ways (rather than 2 ways) at each level, improving performance.
Types of Ordered Indexes
1. Primary Index:
Defined on the ordering key field of an ordered file.
Unique values for the ordering key field ensure each record is
distinctly accessible.
The index entry consists of the ordering key and a pointer to the
disk block containing the first record with that key value.
2. Clustering Index:
Used when the ordering field is not a key field (multiple records can
have the same value).
The file is clustered based on the ordering field.
Only one clustering index per file is allowed, like the primary index.
3. Secondary Index:
Defined on non-ordering fields.
Allows multiple indexes on different fields to enhance access
efficiency.
Can exist alongside a primary or clustering index.
Primary Indexes
Structure:
Ordered file with fixed-length records, consisting of the primary key
and a pointer to a disk block.
Each entry corresponds to the first record in a block (anchor
record).
Index Entries:
<K(i), P(i)> where K(i) is the primary key and P(i) is the block
pointer.
Can point to the physical address, record address, or logical
address of the block or record.
Example:
Index entry: <K(1) = "Aaron, Ed", P(1) = address of block 1>
Sparse vs. Dense Indexes:
Dense Index: Index entry for every search key value.
Sparse Index: Index entry for some search key values (primary
index is typically sparse).
Efficiency:
Smaller index size compared to the data file.
Requires fewer block accesses for search operations.
An example calculation shows the number of block accesses dropping
from 13 to 6 when a primary index is used.
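A minimal sketch of a sparse primary index: one entry per data block whose key is the block's anchor (first) record, searched with binary search on the index followed by a single data-block scan. The in-memory lists, field layout, and names below are illustrative assumptions, not the textbook's code.

```python
# Sparse primary index: one <K(i), P(i)> entry per data block, where K(i) is the
# anchor record's key and P(i) is the block number.
import bisect

data_blocks = [
    [("Aaron, Ed",), ("Abbot, Kim",)],        # block 0
    [("Adams, John",), ("Akers, Jan",)],      # block 1
    [("Alexander, Ed",), ("Alfred, Bob",)],   # block 2
]

# Build the index: anchor key of each block -> block number.
index = [(block[0][0], p) for p, block in enumerate(data_blocks)]

def search(key):
    """Binary-search the (much smaller) index, then scan one data block."""
    keys = [k for k, _ in index]
    pos = bisect.bisect_right(keys, key) - 1     # last anchor key <= search key
    if pos < 0:
        return None
    block = data_blocks[index[pos][1]]           # one data-block access
    return next((rec for rec in block if rec[0] == key), None)

print(search("Akers, Jan"))   # -> ('Akers, Jan',)
```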
Insertion and Deletion Issues:
Insertion: Inserting a new record may require moving records and
updating index entries.
Overflow Handling: Use of unordered overflow files or linked lists
of overflow records to manage insertions and deletions.
Deletion: Handled using deletion markers.
Summary
Primary Index: Efficient for ordered files with unique key fields,
enhancing search performance but complex to manage for insertions
and deletions.
Clustering Index: Used for non-key ordered fields with potential for
multiple records sharing the same field value.
Secondary Index: Flexible access method for non-ordering fields,
supporting multiple indexes for varied query optimization.
Understanding the different types of single-level ordered indexes and their
applications helps in designing efficient database systems, balancing
search performance with manageability.
17.2 Multilevel Indexes
Purpose: Reduce search space in an index more efficiently than a
binary search.
Concept:
Fan-out (fo): The blocking factor for the index, reducing the search
space faster than binary search (n-ways instead of 2-ways).
Example: With a 4,096-byte block size and entries (4-byte block
pointer + 9-byte key), the fan-out can be up to 315.
Structure:
First Level (Base Level): Index file with distinct values.
Second Level: Primary index for the first level, using block anchors
(one entry per block of the first level).
Subsequent Levels: Each level indexes the previous one, reducing
entries by a factor of fo.
Calculation:
Number of Blocks per Level:
First Level: \( r_1 \) entries.
Second Level: \( r_2 = \lceil \frac{r_1}{fo} \rceil \).
Continue until the entries fit in a single block.
Number of Levels (t): \( t = \lceil \log_{fo} (r_1) \rceil \).
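A quick check of the fan-out and level-count formulas, using the assumed values from the example above (4,096-byte blocks, 9-byte keys, 4-byte pointers) and an assumed r1 of 30,000 first-level entries:

```python
# Worked check of fo = floor(B / entry size) and t = ceil(log_fo(r1)).
import math

B, key_size, ptr_size = 4096, 9, 4
fo = B // (key_size + ptr_size)          # fan-out = 315
r1 = 30000                               # first-level index entries (assumed)

levels = 1
blocks = math.ceil(r1 / fo)              # blocks at level 1 = entries at level 2
while blocks > 1:                        # another index level is needed
    levels += 1
    blocks = math.ceil(blocks / fo)

print(fo, levels, math.ceil(math.log(r1, fo)))   # -> 315 2 2
```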
Advantages:
Fewer block accesses compared to binary search.
Example: Searching a multilevel index might require 4 block
accesses compared to 12 in a single-level binary search.
File Organization:
Common in business data processing (e.g., IBM's ISAM).
Indexed sequential file: ordered file with multilevel primary index.
Search Algorithm:
Algorithm 17.1: Searches a nondense multilevel primary index.
Read the top-level block.
Iterate through index levels, updating pointer (p) to next block
based on key comparison.
Finally, read the data file block and search for the record.
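A sketch in the spirit of this search loop (not the textbook's literal Algorithm 17.1), where each index level is a list of (key, pointer) pairs and the bottom-level pointer leads to a data block:

```python
# Nondense multilevel index search: one block access per index level,
# plus one access to the data block. Structures are illustrative.
import bisect

def search_multilevel(top_block, key_value, t):
    """t = number of index levels."""
    node = top_block
    for _ in range(t):
        keys = [k for k, _ in node]
        pos = bisect.bisect_right(keys, key_value) - 1   # largest anchor key <= target
        if pos < 0:
            return None
        node = node[pos][1]                              # follow the block pointer p
    # node is now a data block: search it for the record.
    return next((rec for rec in node if rec["key"] == key_value), None)

# Two-level example: the top block points at level-1 blocks, which point at data blocks.
db = [[{"key": k} for k in (a, a + 2)] for a in (1, 5, 9, 13)]
level1 = [[(1, db[0]), (5, db[1])], [(9, db[2]), (13, db[3])]]
top = [(1, level1[0]), (9, level1[1])]
print(search_multilevel(top, 11, t=2))   # -> {'key': 11}
```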
Dynamic Multilevel Index:
To handle insertions/deletions, dynamic multilevel indexes (e.g., B-
trees, B+-trees) leave space in blocks and use specific algorithms
for growing/shrinking the index.
20.1 Introduction to Transaction Processing
20.1.1 Single-User versus Multiuser Systems
Single-User DBMS:
Only one user can use the system at a time.
Typically found in personal computer systems.
Multiuser DBMS:
Allows many users to access the system and database
concurrently.
Common in applications like airline reservations, banking, and
supermarkets.
Concurrency in Multiuser Systems:
Achieved through multiprogramming, allowing multiple processes to
execute simultaneously.
Processes are interleaved by the operating system to keep the CPU
busy.
Multiple hardware processors enable parallel processing.
Database Access in Multiuser Systems:
Users or programs constantly access and modify database items
concurrently.
20.1.2 Transactions, Database Items, Read and Write Operations, and
DBMS Buffers
Transaction:
Logical unit of database processing.
Includes read, write, insert, delete, or retrieval operations.
Database Items:
Represented as named data items.
Can be of varying granularity, from a whole disk block to a single
attribute value.
Read and Write Operations:
read_item(X): Reads database item X into a program variable (also named X, by convention).
write_item(X): Writes the value of program variable X into the database item X.
(A sketch of both operations working through a buffer appears at the end of this subsection.)
Database Buffer Management:
Buffers in main memory hold disk pages.
Buffer replacement policies determine which buffers are replaced
when new disk pages are needed.
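A toy sketch of read_item/write_item working through a buffer pool; the disk and buffer_pool dictionaries and the _block_of catalog lookup are illustrative assumptions, not a real DBMS API.

```python
# read_item/write_item against an in-memory "disk" and buffer pool (sketch only).

disk = {0: {"X": 100, "Y": 50}}           # block 0 holds items X and Y
buffer_pool = {}                          # block id -> in-memory copy of the block

def _block_of(item):                      # assumed catalog lookup: item -> block id
    return 0

def read_item(name, program_vars):
    block_id = _block_of(name)
    if block_id not in buffer_pool:                      # bring the block into a buffer
        buffer_pool[block_id] = dict(disk[block_id])
    program_vars[name] = buffer_pool[block_id][name]     # copy item into program variable

def write_item(name, program_vars):
    block_id = _block_of(name)
    if block_id not in buffer_pool:
        buffer_pool[block_id] = dict(disk[block_id])
    buffer_pool[block_id][name] = program_vars[name]     # update the buffer copy
    disk[block_id] = dict(buffer_pool[block_id])         # (eventually) write block to disk

pvars = {}
read_item("X", pvars); pvars["X"] -= 10; write_item("X", pvars)
print(disk[0]["X"])   # -> 90
```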
20.1.3 Why Concurrency Control Is Needed
Concurrency Problems:
Lost Update: Interleaved operations overwrite changes made by
another transaction.
Temporary Update (Dirty Read): Transaction reads uncommitted
changes made by another transaction.
Incorrect Summary: Aggregate functions may calculate incorrect
values due to interleaved updates.
Unrepeatable Read: A transaction reads the same item twice and
receives different values due to updates by another transaction.
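The lost-update problem can be reproduced in a few lines by hand-interleaving two transactions on a shared item X (the values used are assumed for illustration):

```python
# Lost update: T1 and T2 both read X before either writes, so T1's update is lost.
X = 100                      # shared database item

# A serial execution would give 100 - 10 + 20 = 110.
t1_local = X                 # T1: read_item(X)
t2_local = X                 # T2: read_item(X)  (interleaved before T1 writes)
t1_local -= 10               # T1: X := X - 10
X = t1_local                 # T1: write_item(X) -> 90
t2_local += 20               # T2: X := X + 20, based on the stale value 100
X = t2_local                 # T2: write_item(X) -> 120; T1's update is overwritten
print(X)                     # -> 120, not the correct 110
```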
20.1.4 Why Recovery Is Needed
Types of Failures:
Transaction Failures: Computer/system errors, transaction errors,
concurrency control enforcement.
Disk Failures: Read/write malfunctions, head crashes.
Physical Problems and Catastrophes: Power failure, fire, theft,
sabotage.
Importance of Recovery:
System must recover from failures to ensure database integrity and
consistency.
Recovery techniques are crucial for maintaining database reliability.
20.2 Transaction and System Concepts
20.2.1 Transaction States and Additional Operations
Transaction: An atomic unit of work, either fully completed or not done
at all.
Key Operations:
BEGIN_TRANSACTION: Marks the start of a transaction.
READ/WRITE: Operations on database items within the transaction.
END_TRANSACTION: Marks the end of transaction operations;
checks for commit or abort.
COMMIT_TRANSACTION: Successfully ends the transaction,
making changes permanent.
ROLLBACK/ABORT: Ends the transaction unsuccessfully, undoing
any changes.
Transaction States:
Active: Transaction starts execution and performs READ/WRITE
operations.
Partially Committed: Transaction has ended, but final checks (e.g.,
for serializability) are pending.
Committed: Transaction successfully completes, and changes are
made permanent.
Failed: Transaction encounters an error and must be rolled back.
Terminated: Transaction finishes, and system information is
removed. Failed transactions may restart.
State Transition Diagram:
States: Active, Partially Committed, Committed, Failed, Terminated.
Transitions: Begin, End, Commit, Abort, Read/Write operations.
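One way to picture the diagram is as a table of legal (state, operation) → next-state moves; the dictionary encoding below is an illustrative sketch using the state and operation names listed above, not part of any DBMS.

```python
# Transaction state transitions as a lookup table (sketch).
TRANSITIONS = {
    ("active", "read/write"): "active",
    ("active", "end"): "partially committed",
    ("active", "abort"): "failed",
    ("partially committed", "commit"): "committed",
    ("partially committed", "abort"): "failed",
    ("committed", "terminate"): "terminated",
    ("failed", "terminate"): "terminated",
}

def next_state(state, op):
    try:
        return TRANSITIONS[(state, op)]
    except KeyError:
        raise ValueError(f"illegal operation {op!r} in state {state!r}")

s = "active"
for op in ("read/write", "end", "commit", "terminate"):
    s = next_state(s, op)
print(s)   # -> terminated
```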
20.2.2 The System Log
Purpose: Track transaction operations and database changes for
recovery.
Log Characteristics:
Sequential, append-only file on disk.
Log buffers in main memory hold recent entries, written to disk
when filled or needed.
Periodic backups to archival storage.
Log Records:
[start_transaction, T]: Marks the start of transaction T.
[write_item, T, X, old_value, new_value]: Records change of item X
by transaction T.
[read_item, T, X]: Records read of item X by transaction T.
[commit, T]: Marks the successful completion of transaction T.
[abort, T]: Marks the abortion of transaction T.
Recovery:
Undo changes by tracing backward through log records.
Redo operations if a transaction’s updates are recorded but not yet
applied to the database.
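A minimal sketch of an append-only log and UNDO by a backward scan, using the record formats listed above; the in-memory log and db structures are illustrative assumptions.

```python
# Append-only log with [write_item, T, X, old_value, new_value] records,
# and UNDO of an uncommitted transaction by tracing the log backward.

log = []
db = {"X": 100, "Y": 50}

def write_item(T, name, new_value):
    log.append(["write_item", T, name, db[name], new_value])   # record old and new value
    db[name] = new_value

def undo(T):
    """Roll back transaction T by restoring old values, newest entry first."""
    for entry in reversed(log):
        if entry[0] == "write_item" and entry[1] == T:
            _, _, name, old_value, _ = entry
            db[name] = old_value
    log.append(["abort", T])

log.append(["start_transaction", "T1"])
write_item("T1", "X", 80)
write_item("T1", "Y", 70)
undo("T1")                 # T1 never committed, so its effects are undone
print(db)                  # -> {'X': 100, 'Y': 50}
```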
20.2.3 Commit Point of a Transaction
Commit Point: Transaction reaches this point when all operations are
successfully executed, and effects are logged.
Commit Process:
Write a commit record [commit, T] to the log.
Force-write log buffer to disk to ensure all log entries are saved.
Ensure transaction effects are permanently recorded in the
database.
Recovery from Failure:
Transactions without a commit record are rolled back.
Transactions with a commit record have their changes redone if
necessary.
20.2.4 DBMS-Specific Buffer Replacement Policies
Purpose: Efficiently manage disk pages in main memory buffers.
Policies:
Domain Separation (DS): Separate buffers for different disk page
types (index, data, log). Basic LRU used within domains.
Hot Set Method: Keeps frequently accessed pages in memory
during specific operations (e.g., nested-loop joins).
DBMIN Method: Uses Query Locality Set Model (QLSM) to predict
page references and allocate buffers accordingly.
Implementation:
DS achieves better performance on average, but its domain allocation is static.
Variations such as group LRU (GLRU) add dynamic load balancing.
DBMIN allocates buffers based on the estimated needs for each file
instance in a query.
20.3 Desirable Properties of Transactions
In database management, transactions should possess several properties
collectively known as the ACID properties. These properties ensure that
transactions are processed reliably and that the integrity of the database is
maintained. The ACID properties are:
1. Atomicity:
A transaction is an indivisible unit of processing. It should be
executed in its entirety or not executed at all.
Responsibility: Transaction recovery subsystem of a DBMS.
If a transaction fails (e.g., due to a system crash), the recovery
technique must undo any effects of the transaction on the database.
Conversely, write operations of a committed transaction must
eventually be written to disk.
2. Consistency Preservation:
A transaction should transform the database from one consistent
state to another, ensuring that integrity constraints are not violated.
Responsibility: Database programmers and the DBMS module
enforcing integrity constraints.
Definition: A consistent database state satisfies all specified
constraints and conditions. If the database is consistent before a
transaction starts, it should remain consistent after the transaction
completes, provided no interference from other transactions occurs.
3. Isolation:
Transactions should appear to execute in isolation from each other,
even though they may be executing concurrently.
Responsibility: Concurrency control subsystem of the DBMS.
Explanation: Isolation ensures that the execution of a transaction is
not affected by other concurrently executing transactions. One
method to enforce isolation is to prevent a transaction's updates
from being visible to others until it is committed, solving the
temporary update problem and eliminating cascading rollbacks.
4. Durability (Permanency):
Changes applied to the database by a committed transaction must
persist, meaning they must not be lost due to any failure.
Responsibility: Recovery subsystem of the DBMS.
Explanation: Durability ensures that once a transaction commits, its
changes are permanent and will survive subsequent system failures.
Levels of Isolation
Isolation levels define the degree to which a transaction must be isolated
from the data modifications made by other transactions. The isolation
levels, from least isolated to most isolated, include:
Level 0 Isolation:
The transaction does not overwrite dirty reads of higher-level
transactions.
Level 1 Isolation:
Prevents lost updates.
Level 2 Isolation:
Prevents lost updates and dirty reads.
Level 3 Isolation (True Isolation):
Ensures repeatable reads in addition to preventing lost updates and
dirty reads. This means that if a transaction reads a data item, it will
read the same value in subsequent reads within the same
transaction, ensuring that no other transaction can modify the data
item during this period.
Snapshot Isolation:
Based on the idea of providing each transaction with a "snapshot"
of the database at a particular point in time, ensuring that it
operates on a consistent set of data. This approach is discussed in
detail in Section 20.6 and again in Chapter 21, Section 21.4.
Summary
The ACID properties—Atomicity, Consistency, Isolation, and Durability—are
critical for ensuring the reliability and integrity of database transactions. By
enforcing these properties, the DBMS can handle concurrent transactions,
maintain consistency, and recover from failures while ensuring that the
database remains in a valid state. Different levels of isolation offer varying
degrees of transaction isolation, balancing between performance and the
strictness of isolation.