Optimization of Queries: An Introduction
One of the most important elements in the database optimization process is enhancing the
performance of SQL queries, and this is achieved through query optimization. This area of study
selection focuses on the distinct methodologies that can be used to execute a SQL query like
optimization on indexation, execution plans, structure of the query and lots more. A query that is not
well optimized will invariably result in slow performance, excessive use of CPU and memory
resources, and ineffectiveness of the system. This guide presents the most relevant methods for
optimizing queries in order to achieve the best possible performance from a given database system.
Improves Performance – Faster query execution reduces response time.
Reduces Resource Usage – Minimizes CPU, memory, and disk I/O consumption.
Enhances Scalability – Supports larger datasets and high transaction volumes.
Prevents System Overload – Avoids slow performance and downtime.
Lowers Costs – Reduces the need for expensive hardware and cloud resources.
Boosts Reporting & Analytics – Enables quick data retrieval for decision-making.
Improves User Experience – Ensures smooth and responsive applications.
Ensures Consistency – Delivers stable and reliable performance.
Execution Plans: Interpreting and Understanding
A database execution plan, found in systems like SQL Server, PostgreSQL, and MySQL, serves as a
guide that the database engine uses to carry out a query. It offers an understanding of the processing
of a query, including the operations carried out and their effectiveness. Grasping execution plans
contributes to optimizing queries, enhancing performance and identifies bottlenecks.
1. Types of Execution Plans
Execution plans fall into two distinct categories:
1. Estimated Execution Plan
It is generated without running the query.
Demonstrates the method by which the database engine plans to carry out the query.
Aids in enhancing query efficiency while maintaining performance levels.
Utilized for assessing costly processes prior to their implementation.
1. Actual Execution Plan
It is generated after execution of the query.
Features actual counts of rows, time taken for execution, and statistics regarding runtime.
Identify real obstacles to performance.
More precise than the projected strategy
Reading an Execution Plans:
Reading an execution plan involves looking at the sequence of actions that the database
engine takes to carry out a query. It encompasses grasping the various functions carried out,
including scans, seeks, joins, and sorts.
When reading an execution plan, pay attention to these aspects:
Execution Order: The execution of the query occurs from the right to the left, leading to the
rightmost operator being executed initially.
Operators: Each type of operation, such as Table Scan, Index Seek, Sort, and Join, is depicted
as a box.
Data Flow and Arrows: The arrows connecting different operations indicate the transfer of
data, with thicker arrows signifying a greater volume of data being handled.
Estimated versus Actual Rows: Determine whether the actual rows processed significantly
exceed the estimated count, as this could suggest an ineffective execution plan.
Interpreting an Execution Plans:
Interpreting an execution plan involves examining the details of each operation to identify
performance issues and areas for optimization. It assists in determining whether to add
indexes, modify queries, or enhance the database design.
Once you've read the execution plan, analyse it to find inefficiencies:
High-cost operators: Look for operations with a high-cost percentage.
Scans vs Seeks: Index Seek is efficient, while Table Scan is slow—consider adding an index if a
scan is used.
Joins Performance: Nested Loops are efficient for small data, but Hash Joins may be better
for large datasets.
Sorts and Aggregations: Sorting large datasets is expensive—use indexing to optimize ORDER
BY queries.
Let's go step by step with a hands-on demonstration using SQL Server Management Studio (SSMS).
🔹 Step 1: Enable Execution Plans
Before running queries, enable execution plans in SSMS:
Press Ctrl + M or
Go to Query → Include Actual Execution Plan
This will allow you to view how SQL Server executes the queries.
🔹 Step 2: Create a Sample Table and Insert Data
Let's create a table and insert some sample records.
sql
CopyEdit
-- Create a sample Orders table
CREATE TABLE Orders (
OrderID INT PRIMARY KEY IDENTITY(1,1),
CustomerID INT,
OrderDate DATETIME,
TotalAmount DECIMAL(10,2)
);
-- Insert sample data
INSERT INTO Orders (CustomerID, OrderDate, TotalAmount)
SELECT TOP 100000
ABS(CHECKSUM(NEWID())) % 1000,
DATEADD(DAY, -ABS(CHECKSUM(NEWID())) % 365, GETDATE()),
RAND() * 500
FROM master.dbo.spt_values;
This creates 100,000 random records.
🔹 Step 3: Run a Query and Read the Execution Plan
Now, run the following query:
sql
CopyEdit
SELECT * FROM Orders WHERE OrderDate > '2024-01-01';
📌 Reading the Execution Plan Output:
Operator Execution Type Cost
Table Scan Reads all rows 80%
Filter Applies WHERE condition 20%
Interpretation:
A Table Scan is happening because there is no index on OrderDate.
The scan processes all rows, making the query slow.
🔹 Step 4: Optimize the Query with an Index
Now, let's create an index on the OrderDate column.
sql
CopyEdit
CREATE INDEX IX_OrderDate ON Orders (OrderDate);
📌 Re-run the same query:
sql
CopyEdit
SELECT * FROM Orders WHERE OrderDate > '2024-01-01';
📌 New Execution Plan Output:
Operator Execution Type Cost
Index Seek Uses an index to find data 10%
Interpretation:
✅ The query now uses an Index Seek instead of a Table Scan, making it much faster.
✅ SQL Server directly finds matching rows instead of scanning all records.
🔹 Step 5: Comparing Joins Performance
Let's compare two queries using JOINs.
Scenario 1: Using a JOIN without Index
sql
CopyEdit
-- Create a Customers table
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
CustomerName VARCHAR(100)
);
-- Insert sample data
INSERT INTO Customers
SELECT DISTINCT ABS(CHECKSUM(NEWID())) % 1000, 'Customer ' + CAST(ABS(CHECKSUM(NEWID()))
% 1000 AS VARCHAR)
FROM Orders;
-- Run a JOIN query
SELECT o.OrderID, c.CustomerName, o.TotalAmount
FROM Orders o
JOIN Customers c ON o.CustomerID = c.CustomerID;
📌 Execution Plan Analysis:
Operator Execution Type Cost
Hash Join Joins large tables 70%
Table Scan (Orders) Reads all rows 20%
Table Scan (Customers) Reads all rows 10%
Interpretation:
A Hash Join is used because CustomerID in Orders is not indexed.
The query scans both tables, which is inefficient.
🔹 Step 6: Optimize JOIN Performance
Let's create an index on CustomerID in Orders:
sql
CopyEdit
CREATE INDEX IX_CustomerID ON Orders (CustomerID);
📌 Re-run the JOIN query:
sql
CopyEdit
SELECT o.OrderID, c.CustomerName, o.TotalAmount
FROM Orders o
JOIN Customers c ON o.CustomerID = c.CustomerID;
📌 New Execution Plan Output:
Operator Execution Type Cost
Nested Loop Join Iterates through indexed data 30%
Index Seek (Orders) Uses an index for lookup 5%
Index Seek (Customers) Uses an index for lookup 5%
Interpretation:
✅ Now an Index Seek is used, reducing scan time.
✅ Nested Loops are used instead of a Hash Join, improving efficiency.
✅ The query runs significantly faster due to indexing.
🔹 Conclusion
Step Action Taken Performance Change
1 Table Scan (Slow Query) 🚨 High Cost (80%)
2 Added Index on OrderDate ✅ Switched to Index Seek (10%)
3 JOIN without Index 🚨 Hash Join (Slow)
4 Added Index on CustomerID ✅ Nested Loops Join (Fast)
Key Takeaways
✅ Reading Execution Plans: Identify expensive operations (scans, sorts, joins).
✅ Interpreting Execution Plans: Understand bottlenecks and optimization areas.
✅ Optimizing Queries: Use indexes, rewrite queries, and analyze joins for better performance.