Advanced SQL Query Optimization Techniques

Uploaded by

tinile
Beyond execution order, there are many additional SQL optimization techniques.
The list below covers the main strategies:

Index-Based Optimizations

1. Covering Indexes

-- Without a covering index, each matching row needs a separate table lookup

SELECT customer_id, email, created_date
FROM customers
WHERE status = 'active';

-- Create covering index (INCLUDE is supported by SQL Server and PostgreSQL 11+)

CREATE INDEX idx_customers_covering ON customers (status) INCLUDE (customer_id, email, created_date);
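
To check whether a query is actually answered from the index alone, the plan can be inspected. The sketch below is only an illustration using Python's built-in sqlite3 module with an in-memory database. SQLite has no INCLUDE clause, so as an assumption the extra columns are added as ordinary key columns; the table layout is also assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id  INTEGER PRIMARY KEY,
        email        TEXT,
        created_date TEXT,
        status       TEXT
    );
    -- Assumption: SQLite lacks INCLUDE, so the extra columns become key columns.
    -- customer_id is the rowid and is stored in every index automatically.
    CREATE INDEX idx_customers_covering
        ON customers (status, email, created_date);
""")

def query_plan(sql):
    """Return the detail text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

covering_plan = query_plan(
    "SELECT customer_id, email, created_date "
    "FROM customers WHERE status = 'active'"
)
print(covering_plan)
```

If the plan mentions a covering index, the query never has to visit the table rows themselves.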

2. Composite Index Order

-- Less optimal index usage: the range column comes first, so only order_date narrows the index scan

CREATE INDEX idx_order_date_status ON orders (order_date, status);
SELECT * FROM orders WHERE status = 'completed' AND order_date > '2024-01-01';

-- Better index order (equality column first, range column last)

CREATE INDEX idx_status_date ON orders (status, order_date);
SELECT * FROM orders WHERE status = 'completed' AND order_date > '2024-01-01';
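
The effect of column order can be observed in a query plan. The following is a rough sketch using Python's sqlite3 module; the helper name plan_with_index, the index name idx_demo, and the table layout are assumptions made for the demonstration, and other engines report their plans differently.

```python
import sqlite3

QUERY = (
    "SELECT * FROM orders "
    "WHERE status = 'completed' AND order_date > '2024-01-01'"
)

def plan_with_index(index_columns):
    """Create an empty orders table with one index and return the query plan."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, status TEXT, order_date TEXT, amount REAL)"
    )
    conn.execute(f"CREATE INDEX idx_demo ON orders ({index_columns})")
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    conn.close()
    return [row[3] for row in rows]

range_first_plan = plan_with_index("order_date, status")
equality_first_plan = plan_with_index("status, order_date")

print("range column first:   ", range_first_plan)
print("equality column first:", equality_first_plan)
```

With the equality column first, SQLite lists both conditions as index constraints, which means the scan starts and stops inside the matching status group.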

3. Partial Indexes

-- Index only relevant data (partial index in PostgreSQL and SQLite; SQL Server calls this a filtered index)

CREATE INDEX idx_active_customers ON customers (customer_id) WHERE status = 'active';

JOIN Optimizations

4. JOIN Order Optimization


-- Less optimal when the engine joins in the written order - large table first
SELECT * FROM large_table lt
JOIN small_table1 st1 ON lt.common_id = st1.common_id
JOIN small_table2 st2 ON lt.common_id = st2.common_id;

-- Better - start with smallest result set (cost-based optimizers usually reorder inner joins themselves)

SELECT * FROM small_table1 st1
JOIN small_table2 st2 ON st1.common_id = st2.common_id
JOIN large_table lt ON st1.common_id = lt.common_id;

5. IN vs EXISTS (Semi-Joins)

-- Can be less optimal with IN on older optimizers

SELECT * FROM customers c
WHERE c.customer_id IN (SELECT customer_id FROM orders WHERE amount > 1000);

-- Often better with EXISTS (many modern optimizers plan both forms as the same semi-join)

SELECT * FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id AND o.amount > 1000);

Subquery Optimizations

6. Correlated vs Non-Correlated Subqueries

-- Less optimal - correlated subquery


SELECT customer_id,
(SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.customer_id) as order_count
FROM customers c;

-- Better - join once and count with a window function
-- (LEFT JOIN keeps customers that have no orders)

SELECT DISTINCT c.customer_id,
COUNT(o.customer_id) OVER (PARTITION BY c.customer_id) as order_count
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;
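
A rewrite like this is only useful if it returns the same rows. As a small check, the sketch below runs the correlated subquery against a join-based alternative on made-up sample data, using Python's sqlite3 module. Note that the alternative here is written with LEFT JOIN and GROUP BY rather than a window function, which is a swapped-in variation chosen for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers (customer_id) VALUES (1), (2), (3);
    -- Sample data (assumed): customer 3 has no orders at all.
    INSERT INTO orders (customer_id) VALUES (1), (1), (2);
""")

correlated = conn.execute("""
    SELECT c.customer_id,
           (SELECT COUNT(*) FROM orders o
            WHERE o.customer_id = c.customer_id) AS order_count
    FROM customers c
    ORDER BY c.customer_id
""").fetchall()

# COUNT(o.order_id) ignores the NULLs produced by the LEFT JOIN,
# so customers without orders get a count of 0 instead of disappearing.
joined = conn.execute("""
    SELECT c.customer_id, COUNT(o.order_id) AS order_count
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

print(correlated)
print(joined)
```

An inner join would silently drop the customer with no orders, so the outer join is what keeps the two forms equivalent.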

7. CTE vs Subqueries for Readability

-- Complex nested subquery


SELECT * FROM (
SELECT customer_id, total_amount,
ROW_NUMBER() OVER (ORDER BY total_amount DESC) as rn
FROM (
SELECT customer_id, SUM(amount) as total_amount
FROM orders GROUP BY customer_id
) t
) ranked WHERE rn <= 10;

-- Better with CTE


WITH customer_totals AS (
SELECT customer_id, SUM(amount) as total_amount
FROM orders GROUP BY customer_id
),
ranked_customers AS (
SELECT customer_id, total_amount,
ROW_NUMBER() OVER (ORDER BY total_amount DESC) as rn
FROM customer_totals
)
SELECT * FROM ranked_customers WHERE rn <= 10;

Function and Expression Optimizations

8. Avoid Functions on Indexed Columns

-- Less optimal - function prevents index usage


SELECT * FROM orders WHERE YEAR(order_date) = 2024;

-- Better - range condition uses index


SELECT * FROM orders WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';
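
The same effect can be reproduced in SQLite, shown here as a sketch with Python's sqlite3 module. SQLite has no YEAR() function, so strftime('%Y', ...) stands in for it as an assumption; the table layout and index name are also made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        order_date TEXT,
        amount     REAL
    );
    CREATE INDEX idx_orders_date ON orders (order_date);
""")

def query_plan(sql):
    """Return the detail text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Wrapping the column in a function hides it from the index.
function_plan = query_plan(
    "SELECT * FROM orders WHERE strftime('%Y', order_date) = '2024'"
)

# A plain range condition on the bare column can use the index.
range_plan = query_plan(
    "SELECT * FROM orders "
    "WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'"
)

print("function on column:", function_plan)
print("range condition:   ", range_plan)
```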

9. IN vs Multiple OR Conditions

-- Less optimal - repeated OR conditions
SELECT * FROM products
WHERE category = 'Electronics' OR category = 'Computers' OR category = 'Mobile';

-- Better - IN list is more concise (most optimizers plan both forms the same way)

SELECT * FROM products
WHERE category IN ('Electronics', 'Computers', 'Mobile');

Aggregate Function Optimizations

10. COUNT(*) vs COUNT(column)

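-- Less optimal - COUNT(column) must check each value for NULL and skips NULLs
SELECT COUNT(customer_id) FROM customers WHERE status = 'active';

-- Better - COUNT(*) counts rows directly (same result when the column is NOT NULL)

SELECT COUNT(*) FROM customers WHERE status = 'active';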
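
The two forms are not always interchangeable, because COUNT(column) ignores NULL values. The sketch below illustrates this with Python's sqlite3 module; the nullable email column and the sample rows are assumptions added for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER, email TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "active"),
        (2, None, "active"),            # active customer without an email
        (3, "c@example.com", "inactive"),
    ],
)

# COUNT(*) counts rows; COUNT(email) counts only rows where email is not NULL.
count_rows, count_emails = conn.execute(
    "SELECT COUNT(*), COUNT(email) FROM customers WHERE status = 'active'"
).fetchone()

print(count_rows, count_emails)
```

Swapping one for the other is therefore only safe when the counted column cannot contain NULLs.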

11. Conditional Aggregation

-- Less optimal - multiple queries


SELECT COUNT(*) FROM orders WHERE status = 'completed';
SELECT COUNT(*) FROM orders WHERE status = 'pending';

-- Better - single query
SELECT
COUNT(CASE WHEN status = 'completed' THEN 1 END) as completed_orders,
COUNT(CASE WHEN status = 'pending' THEN 1 END) as pending_orders
FROM orders;
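
As a quick sanity check, the single-query form can be compared with the separate queries. This is a minimal sketch using Python's sqlite3 module with made-up sample rows; count_status is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("completed",), ("completed",), ("completed",),
     ("pending",), ("pending",), ("cancelled",)],
)

def count_status(status):
    """One query (and one pass over the table) per status."""
    sql = "SELECT COUNT(*) FROM orders WHERE status = ?"
    return conn.execute(sql, (status,)).fetchone()[0]

separate = (count_status("completed"), count_status("pending"))

# One pass over the table: CASE yields NULL for non-matching rows,
# and COUNT skips NULLs.
combined = conn.execute("""
    SELECT COUNT(CASE WHEN status = 'completed' THEN 1 END),
           COUNT(CASE WHEN status = 'pending'   THEN 1 END)
    FROM orders
""").fetchone()

print(separate, combined)
```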

Data Type Optimizations

12. Appropriate Data Types

-- Less optimal - oversized data types


CREATE TABLE products (
id BIGINT, -- INT might suffice
price DECIMAL(20,4), -- DECIMAL(10,2) might suffice
status VARCHAR(255) -- CHAR(1) for single character status
);

-- Better - right-sized data types


CREATE TABLE products (
id INT,
price DECIMAL(10,2),
status CHAR(1)
);

UNION Optimizations

13. UNION vs UNION ALL

-- Less optimal - removes duplicates unnecessarily


SELECT customer_id FROM customers WHERE region = 'North'
UNION
SELECT customer_id FROM customers WHERE region = 'South';

-- Better - single pass with IN, no deduplication step


SELECT customer_id FROM customers WHERE region IN ('North', 'South');

-- Or, if separate queries are needed and the branches cannot overlap


SELECT customer_id FROM customers WHERE region = 'North'
UNION ALL
SELECT customer_id FROM customers WHERE region = 'South';
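
UNION ALL is only a safe replacement when the branches cannot return the same row twice. The sketch below, using Python's sqlite3 module and assumed sample data, shows both the safe case and an overlapping case; the ids helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "North"), (2, "South"), (3, "North"), (4, "East")],
)

def ids(sql):
    """Run a query and return the first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

north = "SELECT customer_id FROM customers WHERE region = 'North'"
south = "SELECT customer_id FROM customers WHERE region = 'South'"

# Disjoint branches: all three forms return the same ids.
union_ids = ids(f"{north} UNION {south}")
union_all_ids = ids(f"{north} UNION ALL {south}")
in_list_ids = ids(
    "SELECT customer_id FROM customers WHERE region IN ('North', 'South')"
)

# Overlapping branches: UNION ALL keeps the duplicate, UNION removes it.
low_ids = "SELECT customer_id FROM customers WHERE customer_id <= 2"
overlap_all = ids(f"{north} UNION ALL {low_ids}")
overlap_union = ids(f"{north} UNION {low_ids}")

print(union_ids, union_all_ids, in_list_ids)
print(overlap_all, overlap_union)
```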

Window Function Optimizations

14. Window Functions vs Self-Joins


-- Less optimal - correlated subquery (re-scans orders for every row, like a self-join)
SELECT o1.order_id, o1.amount,
(SELECT COUNT(*) FROM orders o2 WHERE o2.amount > o1.amount) + 1 as amount_rank
FROM orders o1;

-- Better - window function

SELECT order_id, amount,
RANK() OVER (ORDER BY amount DESC) as amount_rank
FROM orders;
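
The ranking rewrite can be verified on a few rows. The following sketch uses Python's sqlite3 module and assumes SQLite 3.25 or newer, which is when window functions were added. The sample amounts are made up and include a tie. The subquery adds 1 to its count so that it matches RANK(), which starts at 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 50), (2, 200), (3, 200), (4, 75)],
)

# Rank = number of orders with a strictly larger amount, plus one.
subquery_ranks = conn.execute("""
    SELECT o1.order_id, o1.amount,
           (SELECT COUNT(*) FROM orders o2
            WHERE o2.amount > o1.amount) + 1 AS amount_rank
    FROM orders o1
    ORDER BY o1.order_id
""").fetchall()

# RANK() assigns the same rank to ties and leaves a gap after them.
window_ranks = conn.execute("""
    SELECT order_id, amount,
           RANK() OVER (ORDER BY amount DESC) AS amount_rank
    FROM orders
    ORDER BY order_id
""").fetchall()

print(subquery_ranks)
print(window_ranks)
```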

Query Structure Optimizations

15. Avoid SELECT *

-- Less optimal
SELECT * FROM large_table WHERE condition;

-- Better - specify only needed columns


SELECT id, name, email FROM large_table WHERE condition;

16. LIMIT with ORDER BY Optimization

-- Less optimal - sorts entire result set


SELECT * FROM large_table ORDER BY created_date DESC LIMIT 10;

-- Better - with appropriate index on created_date


CREATE INDEX idx_created_date_desc ON large_table (created_date DESC);
SELECT * FROM large_table ORDER BY created_date DESC LIMIT 10;

Advanced Techniques

17. Query Hints and Plan Control

-- Database-specific hints (Oracle-style optimizer hint)

SELECT /*+ INDEX(orders idx_order_date) */ *
FROM orders
WHERE order_date > '2024-01-01';

-- SQL Server equivalent: FROM orders WITH (INDEX(idx_order_date))

18. Batch Processing

-- Instead of processing millions of rows at once


UPDATE large_table SET status = 'processed' WHERE condition;

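-- Process in batches: repeat until no rows are updated
-- (condition must exclude rows already processed; MySQL does not allow LIMIT inside an IN subquery)
UPDATE large_table SET status = 'processed'
WHERE id IN (SELECT id FROM large_table WHERE condition LIMIT 1000);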
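
A batch update is normally driven by a loop in application code. The sketch below shows one possible shape of that loop with Python's sqlite3 module. The batch size, the 'new' status value used as the condition, and the row count are all assumptions for the demonstration.

```python
import sqlite3

BATCH_SIZE = 1000  # assumed batch size; tune for the workload

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE large_table (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO large_table (status) VALUES (?)", [("new",)] * 2500
)
conn.commit()

batches = 0
while True:
    cursor = conn.execute(
        """
        UPDATE large_table SET status = 'processed'
        WHERE id IN (
            SELECT id FROM large_table
            WHERE status = 'new'   -- must exclude rows already processed
            LIMIT ?
        )
        """,
        (BATCH_SIZE,),
    )
    conn.commit()  # committing per batch keeps each transaction short
    if cursor.rowcount == 0:
        break      # nothing left to update
    batches += 1

remaining = conn.execute(
    "SELECT COUNT(*) FROM large_table WHERE status = 'new'"
).fetchone()[0]
print(batches, remaining)
```

The filter inside the subquery matters: without it, every iteration would pick the same rows again and the loop would never finish.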

19. Materialized Views

-- For frequently accessed complex aggregations (PostgreSQL syntax; refresh with REFRESH MATERIALIZED VIEW monthly_sales)


CREATE MATERIALIZED VIEW monthly_sales AS
SELECT DATE_TRUNC('month', order_date) as month,
SUM(amount) as total_sales,
COUNT(*) as order_count
FROM orders
GROUP BY DATE_TRUNC('month', order_date);

20. Partitioning Strategies

-- Partition large tables by date (PostgreSQL declarative partitioning; each partition is then created with PARTITION OF)


CREATE TABLE orders_partitioned (
order_id INT,
order_date DATE,
amount DECIMAL(10,2)
) PARTITION BY RANGE (order_date);

Performance Monitoring

21. Query Execution Plan Analysis

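• Use EXPLAIN PLAN (Oracle) or EXPLAIN ANALYZE (PostgreSQL, MySQL 8.0.18+) to see how a query is executed

• Watch for full table scans on large tables, and for nested-loop or hash joins that process far more rows than expected

• Check for missing indexes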
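
As a small illustration of reading a plan before and after adding an index, the sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. The table layout and index name are assumptions, and the plan wording differs between database engines and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)

def query_plan(sql):
    """Return the detail text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

sql = "SELECT * FROM orders WHERE customer_id = 42"

# No index on customer_id yet: SQLite reports a full scan of the table.
plan_before = query_plan(sql)

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index in place, the plan switches to an index search.
plan_after = query_plan(sql)

print("before:", plan_before)
print("after: ", plan_after)
```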

22. Statistics Maintenance

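• Keep table statistics updated

• Run ANALYZE TABLE (MySQL), ANALYZE (PostgreSQL), or UPDATE STATISTICS (SQL Server) regularly, especially after large data changes

• Monitor cardinality estimates: compare estimated and actual row counts in the execution plan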

Database-Specific Optimizations

23. Connection Pooling

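• Reuse database connections through a pool instead of opening a new one per request

• This avoids paying the connection setup overhead (network handshake, authentication) on every query
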
24. Prepared Statements

-- Reuse execution plans (MySQL syntax)

PREPARE stmt FROM 'SELECT * FROM customers WHERE customer_id = ?';
SET @id = 42;
EXECUTE stmt USING @id;
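
From application code, the usual counterpart is a parameterized query, where the SQL text stays constant and only the bound values change. The sketch below uses Python's sqlite3 module, which keeps a per-connection cache of compiled statements; the table contents and the LOOKUP_SQL name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com"), (3, "c@example.com")],
)

# The SQL text never changes, so the compiled statement can be reused;
# only the bound parameter differs between executions.
LOOKUP_SQL = "SELECT email FROM customers WHERE customer_id = ?"

emails = [
    conn.execute(LOOKUP_SQL, (customer_id,)).fetchone()[0]
    for customer_id in (1, 2, 3)
]
print(emails)
```

Binding values instead of concatenating them into the SQL string also protects against SQL injection.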

25. Bulk Operations

-- Instead of multiple INSERT statements

INSERT INTO table_name VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- Or use a bulk loader for large datasets: BULK INSERT (SQL Server), COPY (PostgreSQL), LOAD DATA INFILE (MySQL)
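
In application code the same idea is usually expressed as a batched insert inside one transaction. The following is a minimal sketch using Python's sqlite3 module; the items table and the generated rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

# Hypothetical data set of 1000 rows.
rows = [(i, f"name-{i}") for i in range(1, 1001)]

# "with conn" wraps the whole batch in a single transaction and commits
# on success, instead of committing after every row.
with conn:
    conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)", rows)

total = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(total)
```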

The key is to profile your queries, understand your data patterns, and apply the most
relevant optimizations based on your specific use case and database system.
