8 Query Optimization

The document covers disk storage and indexing in databases, detailing the types of storage, file organizations, and indexing methods such as primary, clustering, and secondary indexes. It emphasizes the importance of indexing for efficient data retrieval and query optimization, which reduces resource usage and improves performance. Additionally, it discusses query optimization techniques, including cost-based and rule-based optimization, to enhance database query execution strategies.

Uploaded by

hudaaghazi22


University of Salahaddin

College of Engineering
Software & Informatics Engineering Department

DISK STORAGE AND INDEXING
2ND STAGE
Lecturer:
Dr. Hanan Kamal

Content
• Introduction
• Storage of Database
• Files
• Types of Single-Level Ordered Indexes
• Primary Indexes
• Clustering Indexes
• Secondary Indexes

Introduction
• A computerized database must be physically stored on some computer storage medium.
• Computer storage media form a storage hierarchy that includes two main categories:
1. Primary storage
2. Secondary and tertiary storage

• Primary storage usually:
• Provides fast access to data but has limited storage capacity.
• Is more expensive per unit of storage than secondary and tertiary storage devices.

Storage of Databases
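• Databases typically store large amounts of data that must persist over long periods of time; such data is often referred to as persistent data.
• This contrasts with transient data, which persists only for a limited time during program execution. Parts of the persistent data are accessed and processed repeatedly during this period.
• Databases are stored permanently (or persistently) on magnetic disk, for the following reasons:
• Size: databases are generally too large to fit entirely in main memory.
• Losses: permanent loss of stored data occurs less often on disk than in primary storage.
• Cost: storage per unit of data is much cheaper on disk than in primary storage.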

Storage of Databases (cont.)

• The data stored on disk is organized as files of records.
• Each record is a collection of data values that can be
interpreted as facts about entities, their attributes, and
their relationships.
• There are several primary file organizations:
• heap file (or unordered file)
• sorted file (or sequential file), ordered on a sort key
• hashed file, with records placed using a hash key
• B-trees

Files
Unordered (heap) files
1. New records are inserted at the end of the file.
2. To search for a record, a linear search is used.
3. Record insertion is quite efficient.
4. Reading the records in order of a particular field requires sorting the file records.

Ordered (sorted) files
1. File records are kept sorted.
2. A binary search can be used.
3. Insertion is expensive: records must be inserted in the correct order.
4. Reading the records in order of the ordering field is quite efficient.
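
The difference in search effort between the two organizations can be sketched in Python. This is only an illustration, not how a DBMS reads disk blocks: records are modelled as plain integers in a list, and the function names `linear_search` and `binary_search` are assumptions made for this example.

```python
def linear_search(records, key):
    """Scan an unordered (heap) file front to back.

    Returns (position, comparisons); position is None if the key is absent.
    """
    for i, k in enumerate(records):
        if k == key:
            return i, i + 1
    return None, len(records)


def binary_search(sorted_records, key):
    """Search an ordered file by repeatedly halving the search range.

    Returns (position, probes); position is None if the key is absent.
    """
    lo, hi, probes = 0, len(sorted_records) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_records[mid] == key:
            return mid, probes
        if sorted_records[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, probes


heap_file = [42, 7, 19, 3, 88, 61, 25, 50]   # insertion order
sorted_file = sorted(heap_file)               # kept in key order

print(linear_search(heap_file, 61))
print(binary_search(sorted_file, 61))
```

The linear search may touch every record, while the binary search needs a number of probes that grows only with the logarithm of the file size. The price is that the sorted file must be kept in order on every insertion.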

• Some blocks of an ordered (sequential) file of EMPLOYEE records with Name as the ordering key field.

What is Indexing?

• A data structure to quickly locate records without scanning the entire table.
• Acts like a book index for faster access.
• Improves read/query performance significantly.

Why Indexing Matters

• Reduces I/O cost of query execution.
• Improves performance for SELECT, JOIN, and ORDER BY.
• Supports constraints like UNIQUE and PRIMARY KEY.

Types of Single-Level Ordered Indexes

• Indexing idea (book index)
• An index access structure is usually defined on a single field of a file, called an indexing field (or indexing attribute).
• Indexes can also be characterized as dense or sparse.
• A dense index has an index entry for every search key value (and hence every record) in the data file.
• A sparse (or nondense) index has index entries for only some of the search values.
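
As a rough sketch of the dense/sparse distinction, the following Python builds both kinds of index over the same small data file. The data file is modelled as a list of blocks, each holding sorted keys. The names `build_dense_index` and `build_sparse_index`, and the choice of one sparse entry per block, are assumptions for illustration.

```python
def build_dense_index(blocks):
    """One (key, block_no) entry for every record in the data file."""
    return [(key, block_no)
            for block_no, block in enumerate(blocks)
            for key in block]


def build_sparse_index(blocks):
    """One (key, block_no) entry per block, keyed on the block's first record.

    This only works if the data file is ordered on the key.
    """
    return [(block[0], block_no)
            for block_no, block in enumerate(blocks)
            if block]


# Hypothetical data file: three blocks of three keys each, in key order.
data_file = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

dense = build_dense_index(data_file)
sparse = build_sparse_index(data_file)
print(len(dense), len(sparse))
```

The sparse index is smaller because it stores one entry per block rather than one per record, so more of it fits in memory and fewer index blocks need to be read.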

Primary Indexes
• A primary index is an ordered file whose records are of fixed length with two fields.
• Each record is an index entry (or index record); the two field values of index entry i are referred to as <K(i), P(i)>.
• K(i) is of the same data type as the ordering key field, called the primary key, of the data file.
• P(i) is a pointer to a disk block (a block address).
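
A lookup through <K(i), P(i)> entries can be sketched as follows. The sketch assumes the common textbook arrangement of one index entry per data block, with K(i) taken from the first record of the block. The EMPLOYEE names, the `rec…` payloads, and the function names are hypothetical.

```python
from bisect import bisect_right


def build_primary_index(blocks):
    """Return a list of (K(i), P(i)) pairs, one per data block.

    K(i) is the key of the first record in block i; P(i) is the block number.
    """
    return [(block[0][0], block_no) for block_no, block in enumerate(blocks)]


def lookup(index, blocks, key):
    """Binary-search the index, then scan the single block it points to."""
    keys = [k for k, _ in index]
    pos = bisect_right(keys, key) - 1      # last entry with K(i) <= key
    if pos < 0:
        return None                        # key sorts before the first block
    _, block_no = index[pos]
    for k, record in blocks[block_no]:
        if k == key:
            return record
    return None


# Hypothetical ordered file: (Name, record) pairs, two records per block.
employee_blocks = [
    [("Aaron", "rec1"), ("Adams", "rec2")],
    [("Baker", "rec3"), ("Chen", "rec4")],
    [("Dunn", "rec5"), ("Evans", "rec6")],
]

primary_index = build_primary_index(employee_blocks)
print(lookup(primary_index, employee_blocks, "Chen"))
```

Only the small index is searched with binary search; after that a single data block is read, which is where the I/O saving comes from.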

Clustering Indexes
• If file records are physically ordered on a nonkey field, which does not have a distinct value for each record, that field is called the clustering field and the data file is called a clustered file.
• A clustering index is also an ordered file with two fields:
1. A field of the same type as the clustering field of the data file.
2. A disk block pointer.
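
A minimal sketch of a clustering index follows. It assumes one index entry per distinct value of the clustering field, pointing to the first block that contains that value. A Python dict stands in for the ordered index file, and the department numbers and names are made up.

```python
def build_clustering_index(blocks):
    """Map each distinct clustering-field value to the first block holding it."""
    index = {}
    for block_no, block in enumerate(blocks):
        for value, _ in block:
            index.setdefault(value, block_no)   # keep only the first block
    return index


def fetch(index, blocks, value):
    """Return all records with the given clustering-field value."""
    start = index.get(value)
    if start is None:
        return []
    matches = []
    for block in blocks[start:]:
        for v, record in block:
            if v == value:
                matches.append(record)
            elif v > value:
                return matches   # file is ordered, so no later match exists
    return matches


# Hypothetical clustered file ordered on department number (a nonkey field).
clustered_file = [
    [(1, "a"), (1, "b"), (2, "c")],
    [(2, "d"), (3, "e"), (3, "f")],
    [(3, "g"), (4, "h")],
]

clustering_index = build_clustering_index(clustered_file)
print(fetch(clustering_index, clustered_file, 3))
```

Because records with equal values are physically adjacent, the scan starts at the indexed block and stops as soon as a larger value appears.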

Secondary Indexes
• A secondary index provides a secondary means of accessing a data file for which some primary access already exists.
• The data file records could be ordered or unordered.
• The secondary index may be created on:
1. A candidate key field, which has a unique value in every record.
2. A nonkey field with duplicate values.
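
The second case, a secondary index on a nonkey field, can be sketched like this. The data file is unordered, so the index keeps a list of record addresses for every value. The addresses are modelled as (block, slot) pairs, and the field names `id` and `dept` are hypothetical.

```python
from collections import defaultdict


def build_secondary_index(blocks, field):
    """Map each value of `field` to the (block, slot) addresses that hold it."""
    index = defaultdict(list)
    for block_no, block in enumerate(blocks):
        for slot, record in enumerate(block):
            index[record[field]].append((block_no, slot))
    return dict(sorted(index.items()))   # index entries kept in key order


def fetch(index, blocks, value):
    """Follow every pointer stored for `value`."""
    return [blocks[b][s] for b, s in index.get(value, [])]


# Hypothetical unordered data file.
staff_blocks = [
    [{"id": 3, "dept": "HR"}, {"id": 1, "dept": "IT"}],
    [{"id": 2, "dept": "HR"}, {"id": 4, "dept": "Ops"}],
]

dept_index = build_secondary_index(staff_blocks, "dept")
print([r["id"] for r in fetch(dept_index, staff_blocks, "HR")])
```

Unlike the primary and clustering cases, matching records can sit in any block, so the index has to reference every record rather than one anchor per block.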

Secondary index on a key, nonordering field of a file.

Secondary index on a nonkey, nonordering field of a file.

Summary of Indexes

Thank you

1. What is indexing?
2. Talk about:
a. Primary Indexes
b. Clustering Indexes
c. Secondary Indexes
University of Salahaddin
College of Engineering
Software & Informatics Engineering Department

QUERY OPTIMIZATION
2ND STAGE

Lecturer:
Dr. Hanan Kamal

Introduction
• In network and hierarchical DBMSs, low-level
procedural query language is generally
embedded in high-level programming language.
• Programmer’s responsibility to select most
appropriate execution strategy.
• With declarative languages such as SQL, user
specifies what data is required rather than how it
is to be retrieved.
• Relieves user of knowing what constitutes good
execution strategy.

Introduction (cont.)
• Also gives DBMS more control over system performance.
• Two main techniques for query optimization:
• Heuristic rules that order operations in a query;
• Comparing different strategies based on relative costs, and selecting one that minimizes resource usage.
• Disk access tends to be dominant cost in query processing for centralized DBMS.

Query Processing

Activities involved in retrieving data from the database.

• Aims of QP:
• Transform query written in high-level language (e.g. SQL) into correct and efficient execution strategy expressed in low-level language (implementing RA);
• Execute strategy to retrieve required data.

Introduction to Query Optimization
• Query optimization (QO) is the process of choosing the most efficient way to execute a SQL query.
• The importance of QO:
- Reduces response time
- Minimizes resource usage
- Essential for large-scale databases

Query Optimization
• A query optimizer evaluates multiple query execution
plans and chooses the best one based on cost
estimates.
• As there are many equivalent transformations of same
high-level query, aim of QO is to choose one that
minimizes resource usage.
• Generally, reduce total execution time of query.
• May also reduce response time of query.
• Problem computationally intractable with large number of
relations, so strategy adopted is reduced to finding near
optimum solution.
Goals of Query Optimization
- Minimize I/O cost
- Reduce CPU usage
- Ensure fastest response time
- Avoid unnecessary computations
Cost-Based Optimization
This method evaluates alternative plans using a
cost model based on:
- Disk I/O
- CPU cost
- Network latency
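
The idea of a cost model can be sketched as a weighted sum over the three components listed above. Everything concrete here is an assumption for illustration: the weights, the plan names, and the per-plan figures are invented, and real optimizers use far more detailed models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    disk_io: float    # estimated block reads/writes
    cpu: float        # estimated tuple comparisons
    network: float    # estimated bytes shipped (0 for a centralized DBMS)


def plan_cost(plan, w_io=1.0, w_cpu=0.01, w_net=0.1):
    """Weighted sum of the cost components; the weights are illustrative."""
    return w_io * plan.disk_io + w_cpu * plan.cpu + w_net * plan.network


def choose_plan(plans, **weights):
    """Pick the candidate plan with the lowest estimated cost."""
    return min(plans, key=lambda p: plan_cost(p, **weights))


candidate_plans = [
    Plan("full scan + nested loop join", disk_io=5000, cpu=200_000, network=0),
    Plan("index lookup + hash join", disk_io=300, cpu=40_000, network=0),
]

best = choose_plan(candidate_plans)
print(best.name)
```

Giving disk I/O the largest weight reflects the point made earlier that disk access tends to dominate in a centralized DBMS.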
Rule-Based Optimization
Uses heuristics (rules) to transform queries:
- Push selections early
- Combine projections
- Join order optimization

Example 21.1 - Different Strategies

Find all Managers who work at a London branch.

SELECT *
FROM Staff s, Branch b
WHERE [Link] = [Link] AND
(s.position = 'Manager' AND b.city = 'London');

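Example 21.1 - Different Strategies (cont.)

• Three equivalent RA queries are:
(1) $\sigma_{(\text{position}=\text{'Manager'}) \land (\text{city}=\text{'London'}) \land (\text{[Link]}=\text{[Link]})}(\text{Staff} \times \text{Branch})$
(2) $\sigma_{(\text{position}=\text{'Manager'}) \land (\text{city}=\text{'London'})}(\text{Staff} \bowtie_{\text{[Link]}=\text{[Link]}} \text{Branch})$
(3) $(\sigma_{\text{position}=\text{'Manager'}}(\text{Staff})) \bowtie_{\text{[Link]}=\text{[Link]}} (\sigma_{\text{city}=\text{'London'}}(\text{Branch}))$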
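
To see why the three strategies differ, their disk accesses can be counted under a simple model. The slides give no cardinalities, so the figures below are assumptions: 1000 Staff tuples, 50 Branch tuples, 50 Managers, 5 London branches. The model also assumes no indexes, tuples read one at a time, every Staff tuple joining exactly one Branch tuple, intermediate results written to disk and read back, and the final write ignored.

```python
def strategy_costs(n_staff, n_branch, n_managers, n_london):
    """Return the estimated disk accesses for strategies (1), (2) and (3)."""
    # (1) Read both relations, write the Cartesian product, read it back
    #     to apply the selection.
    cost1 = (n_staff + n_branch) + 2 * (n_staff * n_branch)

    # (2) Read both relations to join them; the join result has one tuple
    #     per Staff tuple, which is written and then read for the selection.
    cost2 = (n_staff + n_branch) + 2 * n_staff

    # (3) Read Staff and write the Managers; read Branch and write the
    #     London branches; then read both reduced relations to join them.
    cost3 = (n_staff + n_managers) + (n_branch + n_london) \
        + (n_managers + n_london)

    return cost1, cost2, cost3


costs = strategy_costs(n_staff=1000, n_branch=50, n_managers=50, n_london=5)
print(costs)
```

Under these assumptions the strategy that applies the selections before the join is cheapest by a wide margin, which is the motivation for the heuristic of pushing selections early.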
Join Optimization
Join operations can be expensive. Types of join
algorithms:
- Nested Loop Join
- Hash Join
- Merge Join
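
Two of these algorithms can be sketched in a few lines of Python over in-memory lists of dicts. This is an illustration of the idea only: real implementations work block by block on disk, and the relation contents and the `branch_id` field are hypothetical. Merge join is not shown.

```python
from collections import defaultdict


def nested_loop_join(left, right, left_key, right_key):
    """Compare every left tuple with every right tuple."""
    return [(l, r)
            for l in left
            for r in right
            if l[left_key] == r[right_key]]


def hash_join(left, right, left_key, right_key):
    """Build a hash table on the right input, then probe it with the left."""
    buckets = defaultdict(list)
    for r in right:                       # build phase
        buckets[r[right_key]].append(r)
    return [(l, r)                        # probe phase
            for l in left
            for r in buckets.get(l[left_key], [])]


staff = [
    {"name": "Ann", "branch_id": 1},
    {"name": "Bob", "branch_id": 2},
    {"name": "Cy", "branch_id": 1},
    {"name": "Dee", "branch_id": 9},     # no matching branch
]
branch = [
    {"branch_id": 1, "city": "London"},
    {"branch_id": 2, "city": "Glasgow"},
]

nl_result = nested_loop_join(staff, branch, "branch_id", "branch_id")
hj_result = hash_join(staff, branch, "branch_id", "branch_id")
print(len(nl_result), nl_result == hj_result)
```

The nested loop performs a comparison for every pair of tuples, while the hash join reads each input once, at the cost of memory for the hash table.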
Subquery Optimization
• Rewrite subqueries as joins or EXISTS to
improve performance.
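
The equivalence behind such rewrites can be checked with Python's built-in `sqlite3` module. The schema and rows below are hypothetical, and the sketch only shows that the three forms return the same rows. Whether one form runs faster depends on the DBMS and its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch (branch_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE staff  (staff_id INTEGER PRIMARY KEY, name TEXT,
                         branch_id INTEGER);
    INSERT INTO branch VALUES (1, 'London'), (2, 'Glasgow'), (3, 'London');
    INSERT INTO staff  VALUES (10, 'Ann', 1), (11, 'Bob', 2),
                              (12, 'Cy', 3), (13, 'Dee', NULL);
""")

QUERIES = {
    # Original form: uncorrelated subquery with IN.
    "in": """SELECT name FROM staff
             WHERE branch_id IN (SELECT branch_id FROM branch
                                 WHERE city = 'London')
             ORDER BY name""",
    # Rewritten with a correlated EXISTS.
    "exists": """SELECT name FROM staff s
                 WHERE EXISTS (SELECT 1 FROM branch b
                               WHERE b.branch_id = s.branch_id
                                 AND b.city = 'London')
                 ORDER BY name""",
    # Rewritten as a join (safe here because branch_id is unique in branch).
    "join": """SELECT s.name FROM staff s
               JOIN branch b ON b.branch_id = s.branch_id
               WHERE b.city = 'London'
               ORDER BY s.name""",
}

results = {label: conn.execute(sql).fetchall()
           for label, sql in QUERIES.items()}
print(results)
```

The join form is only equivalent when the joined column is unique on the subquery side. Otherwise the join can return duplicate rows that the IN and EXISTS forms would not.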
View Materialization
• Views can be materialized (stored) or
virtual (computed per query).
Materialization improves performance at
the cost of storage.
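
The trade-off can be illustrated with `sqlite3`. SQLite has ordinary (virtual) views but no built-in materialized views, so the materialized case is emulated here with `CREATE TABLE ... AS`. The table and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('north', 10), ('south', 20), ('north', 5);

    -- Virtual view: the query is re-run every time the view is read.
    CREATE VIEW v_totals AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

    -- Emulated materialized view: the result is stored as a table.
    CREATE TABLE m_totals AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

    -- A later update to the base table.
    INSERT INTO sales VALUES ('north', 100);
""")

virtual = dict(conn.execute("SELECT region, total FROM v_totals"))
materialized = dict(conn.execute("SELECT region, total FROM m_totals"))
print(virtual, materialized)
```

The stored copy answers queries without recomputing the aggregate, but it takes extra space and goes stale until it is refreshed, whereas the virtual view always reflects the current base data.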
Statistics for Optimization
Database optimizers use table statistics:
- Row counts
- Value distributions
- Index usage
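
How such statistics feed into an estimate can be sketched for an equality predicate. With only a row count and a distinct-value count, a common simplification is to assume values are uniformly distributed. A value distribution (here an exact frequency count) gives a better estimate for skewed data. The column contents and function names are made up for the example.

```python
from collections import Counter


def estimate_uniform(row_count, distinct_values):
    """Rows expected to match `col = constant` if values are spread evenly."""
    return row_count / distinct_values


def estimate_from_histogram(histogram, value):
    """Rows expected to match `col = value` using per-value frequencies."""
    return histogram.get(value, 0)


# Hypothetical column contents: skewed towards 'London'.
city_column = ["London"] * 5 + ["Glasgow"] * 2 + ["Leeds"] * 3

row_count = len(city_column)
histogram = Counter(city_column)

uniform_guess = estimate_uniform(row_count, len(histogram))
histogram_guess = estimate_from_histogram(histogram, "London")
print(uniform_guess, histogram_guess)
```

When the data is skewed, the uniform assumption underestimates frequent values and overestimates rare ones, which can lead the optimizer to a poor plan.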
Query Rewriting Techniques
- Remove redundant conditions
- Simplify expressions
- Consider EXISTS instead of IN for subqueries (the benefit depends on the DBMS optimizer)

Phases of Query Processing

Summary
✔ Query optimization is crucial for efficient database systems
✔ Use indexes, rewrite queries, and analyze execution plans
✔ Know your data and schema

Thank you
