SQL Performance Explained PDF
Markus Winand
Sql Performance Explained
Master SQL Performance for Optimal Database
Efficiency
Written by Bookey
About the book
"SQL Performance Explained" is an essential guide for
developers seeking to enhance the performance of their
databases across all major SQL platforms. This comprehensive
resource begins with fundamental concepts such as indexing
and the WHERE clause, then seamlessly navigates through the
intricacies of SQL statements, highlighting common pitfalls
associated with object-relational mapping (ORM) tools like
Hibernate. Key topics include the effective use of
multi-column indexes, appropriate application of SQL
functions, optimization of LIKE queries, efficient join
operations, data clustering for improved performance,
pipelined execution techniques for ORDER BY and GROUP
BY, as well as strategies for optimizing pagination queries and
understanding database scalability. Its systematic approach
makes "SQL Performance Explained" an indispensable
textbook and reference manual for every developer's library.
About the author
Markus Winand is a distinguished expert in the field of
database performance optimization, renowned for his
insightful contributions to SQL and relational database
management systems. With a strong background in software
development and database technologies, Winand has dedicated
his career to helping developers and database administrators
understand the intricacies of SQL performance. Through his
extensive research, presentations, and publications, including
the widely acclaimed book "SQL Performance Explained," he
has become a sought-after speaker and educator, empowering
professionals to leverage best practices for optimizing
database queries and enhancing overall application efficiency.
His passion for clear communication and practical solutions
has made him a respected figure in the tech community, where
he continues to advocate for better understanding of SQL
performance principles across industries.
Summary Content List
Chapter 1 : Functions
Chapter 1 Summary : Functions
Introduction to Indexes
- DB2: Supports function-based indexes on zOS; requires real columns or triggers for expression results.
- MySQL: Does not support function-based indexes or virtual columns (as of version 5). A real column is necessary for expressions.
- Oracle: Supports function-based indexes since 8i and introduced virtual columns with 11g.
- PostgreSQL: Supports expression-based indexes.
- SQL Server: Supports computed columns that can be indexed.
Case-Insensitive Searches
Optimization and Execution Plans
Statistics Collection
Over Indexing Risks
Best Practices
Example
Key Point:Understanding Index Utilization
Example:Imagine you're writing a SQL query to fetch
customer names from your database. To search regardless
of whether a name was entered in upper or lower case,
you can apply the UPPER function to both the search
term and the column. An ordinary index on the column
does not cover UPPER(column), however, so without a
function-based index on that exact expression the
database falls back to a full table scan and the query
slows down significantly. The indexing strategy has to
match the expression the query actually uses.
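The effect described in this example can be reproduced in a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in, on the assumption that SQLite's expression indexes behave like the function-based indexes discussed here. The `customers` table, the index names, and the `query_plan` helper are hypothetical and chosen only for illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, last_name TEXT)")
conn.executemany(
    "INSERT INTO customers (last_name) VALUES (?)",
    [("Winand",), ("WINAND",), ("Smith",)],
)

search = "SELECT id, last_name FROM customers WHERE UPPER(last_name) = UPPER(?)"

# A plain index on last_name does not cover the expression UPPER(last_name).
conn.execute("CREATE INDEX customers_name ON customers (last_name)")
plan_plain = query_plan(conn, search, ("winand",))

# An index on the exact expression that the WHERE clause uses.
conn.execute("CREATE INDEX customers_up_name ON customers (UPPER(last_name))")
plan_expr = query_plan(conn, search, ("winand",))

matches = conn.execute(search, ("winand",)).fetchall()

print(plan_plain)  # the plan before the expression index exists
print(plan_expr)   # the plan once the expression index is available
print(matches)
```

Comparing the two printed plans shows whether the database scans every entry or searches through the expression index. Other databases report this differently, so the plan text itself is SQLite-specific.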
Critical Thinking
Key Point:Over Indexing Risks
Critical Interpretation:The author argues that excessive
indexing can harm performance due to redundancy and
increased overhead, but this perspective may overlook
scenarios where multiple indexes can enhance query
optimization in complex databases.
Chapter 2 Summary : Bind Parameter
Bind Parameters
2. Performance: The Oracle optimizer can reuse cached execution plans for statements that are executed multiple times with different values.
Column Histograms
Bind parameters hinder histogram-based optimization because
the actual values are unknown during parsing, preventing
effective use of histograms.
As a result, bind parameters do not always yield the best
execution plan when data values are unevenly distributed.
- Optimization Tips:
  - Use bind values generally, except when specific cases show distinct performance benefits from histograms.
  - Avoid unnecessary parsing: reuse cached execution plans wherever possible.
- C#
```csharp
// Assumes an open SqlConnection `connection` and an int `subsidiary_id`.
SqlCommand command = new SqlCommand(
    "select first_name, last_name from employees where subsidiary_id = @subsidiary_id",
    connection);
command.Parameters.AddWithValue("@subsidiary_id", subsidiary_id);
```
- Java
```java
// Assumes an open java.sql.Connection `connection` and an int `subsidiary_id`.
PreparedStatement command = connection.prepareStatement(
    "select first_name, last_name from employees where subsidiary_id = ?");
command.setInt(1, subsidiary_id);
```
- Perl
```perl
# Assumes a connected DBI handle $dbh.
my $sth = $dbh->prepare(
    "select first_name, last_name from employees where subsidiary_id = ?");
$sth->execute($subsidiary_id);
```
- PHP
```php
// Assumes a connected mysqli object $mysqli.
if ($stmt = $mysqli->prepare(
        "select first_name, last_name from employees where subsidiary_id = ?")) {
    $stmt->bind_param("i", $subsidiary_id);
    $stmt->execute();
}
```
- Ruby
```ruby
# Assumes a connected database handle `dbh`.
sth = dbh.prepare("select first_name, last_name from employees where subsidiary_id = ?")
sth.execute(subsidiary_id)
```
The Oracle database has introduced several features to improve
the compatibility of bind parameters and histograms:
- CURSOR_SHARING: Rewrites SQL to use bind parameters, primarily a workaround for legacy applications.
- Bind Peeking: Uses the first execution's bind values during parsing.
- Adaptive Cursor Sharing: Allows multiple execution plans for the same statement based on execution performance.
These features try to compensate for uneven data
distributions. Where search keys are substantially
imbalanced, applications should still consider using literal
values.
Example
Key Point:Use bind parameters for improved
security and performance.
Example:Imagine you're creating a web application
where users input their own data. By using bind
parameters, you ensure that user input is safely handled,
protecting against SQL injections that could
compromise your database. Additionally, when running
the same query multiple times with different user inputs,
you leverage the database's ability to reuse execution
plans, significantly speeding up retrieval times. This
way, your application remains both secure and efficient,
providing a seamless experience for your users.
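The security half of this key point can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in database. The `employees` rows and the two helper functions are hypothetical, and the concatenating variant is shown only to demonstrate what bind parameters prevent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (first_name TEXT, last_name TEXT, subsidiary_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ada", "Lovelace", 1), ("Alan", "Turing", 2), ("Grace", "Hopper", 2)],
)


def names_concatenated(user_input):
    # UNSAFE: the input becomes part of the SQL text and can change its structure.
    sql = (
        "SELECT last_name FROM employees WHERE subsidiary_id = "
        + user_input
        + " ORDER BY last_name"
    )
    return [row[0] for row in conn.execute(sql)]


def names_bound(user_input):
    # SAFE: the input travels separately and is only ever treated as a value.
    sql = "SELECT last_name FROM employees WHERE subsidiary_id = ? ORDER BY last_name"
    return [row[0] for row in conn.execute(sql, (user_input,))]


malicious = "1 OR 1=1"
print(names_concatenated(malicious))  # the injected OR clause matches every row
print(names_bound(malicious))         # no subsidiary_id equals this string
print(names_bound(2))                 # regular use with a numeric value
```

With the bound variant the SQL text never changes, which is also what makes it eligible for execution plan reuse.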
Critical Thinking
Key Point:Bind parameters are generally beneficial
for security and performance, but caution is
warranted.
Critical Interpretation:While Markus Winand suggests
that bind parameters enhance SQL execution due to
cached plans and protection against injection, one must
recognize that this universal application might not hold
under all circumstances. For instance, the performance
implications linked to bind parameters can vary
significantly depending on the actual data distribution.
In certain cases, particularly when histograms reveal
distinct data patterns, using literal values may deliver
superior performance, contradicting the notion that bind
parameters are always preferable. This highlights the
importance of context-specific analysis, inviting
skepticism about blanket statements in database
performance optimization. Sources such as 'SQL Tuning'
by Dan Tow and Oracle's own
documentation on Adaptive Cursor Sharing and
Histograms substantiate the need for a more nuanced
approach.
Chapter 3 Summary : NULL And Indexes
Introduction
- DB2: Does not treat empty strings as NULL and includes NULL in every index.
- MySQL: Similar to DB2, it does not equate empty strings with NULL and includes NULL in every index.
- Oracle: Unique in considering empty strings as NULL and excluding rows from indexes if all indexed columns are NULL.
- PostgreSQL and SQL Server: Like MySQL and DB2, do not treat empty strings as NULL and include NULL in every index.
Indexing NULL
Chapter 4 Summary : Searching For Ranges
- Explicit Conditions: SQL queries with clear start and stop conditions on a single column (e.g., `DATE_OF_BIRTH`) can utilize an index efficiently.
- Multiple Conditions: When a second column is involved, the order of columns in the index matters. For instance, an index on `DATE_OF_BIRTH` and `SUBSIDIARY_ID` will behave differently than an index on the reverse order.
- DATE_OF_BIRTH First: Performs poorly with subsidiary filtering because records are spread across the index.
- SUBSIDIARY_ID First: Performs significantly better as it directly narrows down the leaf nodes.
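The difference between the two column orders can be made visible in a small sketch. It uses Python's built-in sqlite3 module as a stand-in and assumes that SQLite's query plan output reflects the same access-condition logic. The table layout, the index name `emp_test`, and the helper functions are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


def plan_with_index(index_columns):
    """Create a fresh table with one concatenated index and return the plan."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, last_name TEXT,"
        " date_of_birth TEXT, subsidiary_id INTEGER)"
    )
    conn.execute("CREATE INDEX emp_test ON employees (" + index_columns + ")")
    sql = (
        "SELECT last_name FROM employees"
        " WHERE date_of_birth >= ? AND date_of_birth <= ? AND subsidiary_id = ?"
    )
    return query_plan(conn, sql, ("1971-01-01", "1971-01-09", 27))


plan_dob_first = plan_with_index("date_of_birth, subsidiary_id")
plan_sub_first = plan_with_index("subsidiary_id, date_of_birth")

# The conditions listed in parentheses are the ones that bound the index scan.
print(plan_dob_first)
print(plan_sub_first)
```

Only with `subsidiary_id` leading do both conditions show up as index search conditions. With `date_of_birth` leading, the subsidiary filter cannot narrow the scanned range.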
Performance of the `LIKE` operator depends heavily on the
selectivity of the prefix before the first wildcard. Patterns
that begin with a wildcard (e.g., `'%TERM'`) cannot narrow
the scanned range and result in a full index scan.
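Why only the prefix helps can be sketched without a database. The example below treats a sorted Python list as a simplified stand-in for the ordered leaf nodes of an index. The function names and the sample keys are hypothetical, and a real B-tree is of course more involved.

```python
import bisect


def prefix_scan(sorted_keys, prefix):
    """LIKE 'prefix%': locate start and stop by binary search, read only that slice."""
    if not prefix:
        return list(sorted_keys), len(sorted_keys)
    start = bisect.bisect_left(sorted_keys, prefix)
    # Smallest string that sorts after every key beginning with the prefix.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    stop = bisect.bisect_left(sorted_keys, upper)
    return sorted_keys[start:stop], stop - start


def infix_scan(sorted_keys, term):
    """LIKE '%term%': the sort order does not help, so every entry is checked."""
    matches = [key for key in sorted_keys if term in key]
    return matches, len(sorted_keys)


keys = sorted(["ADAMS", "BROWN", "SMITH", "TWINING", "WINAND", "WINDSOR", "WINTER"])

prefix_matches, prefix_visited = prefix_scan(keys, "WIN")
infix_matches, infix_visited = infix_scan(keys, "WIN")

print(prefix_matches, prefix_visited)  # 3 entries visited
print(infix_matches, infix_visited)    # all 7 entries visited
```

A longer prefix shrinks the slice between start and stop, which mirrors the point about prefix selectivity.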
Bitmap Indexes
Database-Specific Solutions
accesses and bitmap merging.
Conclusion
Chapter 5 Summary : Obfuscated Conditions
Obfuscated Conditions
Dates
```sql
SELECT ... FROM ... WHERE TRUNC(date_column) = TRUNC(sysdate - 1)
```
-
Solution:
Tip:
String representations of dates can also cause issues:
```sql
SELECT ... FROM ... WHERE TO_CHAR(date_column,
'YYYY-MM-DD') = '1970-01-01'
```
Instead, convert the string to native DATE representation:
```sql
SELECT ... FROM ... WHERE date_column =
TO_DATE('1970-01-01', 'YYYY-MM-DD')
```
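The same idea can be tried out in a runnable sketch. It uses Python's built-in sqlite3 module as a stand-in. SQLite has no TO_CHAR or TO_DATE, so `date()` plays the role of the conversion applied to the column, and an explicit range plays the role of the comparison against native values. The `sales` table and its contents are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, sale_date TEXT)")
conn.execute("CREATE INDEX sales_date ON sales (sale_date)")
conn.executemany(
    "INSERT INTO sales (sale_date) VALUES (?)",
    [
        ("1969-12-31 23:59:59",),
        ("1970-01-01 08:30:00",),
        ("1970-01-01 17:45:00",),
        ("1970-01-02 00:00:00",),
    ],
)

# Converting the column hides it from the index on sale_date.
wrapped = "SELECT sale_id FROM sales WHERE date(sale_date) = ?"
# Comparing the bare column against boundary values keeps the index usable.
ranged = "SELECT sale_id FROM sales WHERE sale_date >= ? AND sale_date < ?"

plan_wrapped = query_plan(conn, wrapped, ("1970-01-01",))
plan_ranged = query_plan(conn, ranged, ("1970-01-01", "1970-01-02"))

ids_wrapped = sorted(r[0] for r in conn.execute(wrapped, ("1970-01-01",)))
ids_ranged = sorted(r[0] for r in conn.execute(ranged, ("1970-01-01", "1970-01-02")))

print(plan_wrapped)
print(plan_ranged)
print(ids_wrapped, ids_ranged)  # both queries return the same rows
```

Both statements return the same rows, but only the second leaves the indexed column untouched so that the index can bound the search.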
Numeric Strings
```sql
SELECT ... FROM ... WHERE numeric_string = 42
```
-
Effective Query:
```sql
SELECT ... FROM ... WHERE numeric_string =
TO_CHAR(42)
```
Using numeric types to store numbers is recommended to
prevent performance issues and exceptions due to type
mismatches.
Date/Time Concatenation
Smart Logic
Tip:
Math Obfuscation
```sql
SELECT numeric_number FROM table_name WHERE
numeric_number - 1000 > ?
```
Function-based indexing can resolve these issues. Transform
conditions to enable index usage:
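```sql
-- Index the expression itself.
CREATE INDEX a_minus_b ON table_name (a - b);
-- The query must use the same expression for the index to apply.
SELECT a, b FROM table_name WHERE a - b = 0;
```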
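Where a condition can be solved for the column, moving the arithmetic to the other side of the comparison is an alternative to a function-based index. The sketch below illustrates this with Python's built-in sqlite3 module as a stand-in. The rewritten statement is an illustration added here rather than one taken from the book, and the index name `tn_num` is hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (numeric_number INTEGER)")
conn.execute("CREATE INDEX tn_num ON table_name (numeric_number)")
conn.executemany(
    "INSERT INTO table_name (numeric_number) VALUES (?)",
    [(500,), (1500,), (2500,)],
)

# The arithmetic on the column hides it from the plain index.
obfuscated = (
    "SELECT numeric_number FROM table_name WHERE numeric_number - 1000 > ?"
)
# Equivalent condition with the bare column on the left-hand side.
rewritten = (
    "SELECT numeric_number FROM table_name WHERE numeric_number > ? + 1000"
)

plan_obfuscated = query_plan(conn, obfuscated, (1000,))
plan_rewritten = query_plan(conn, rewritten, (1000,))

rows_obfuscated = conn.execute(obfuscated, (1000,)).fetchall()
rows_rewritten = conn.execute(rewritten, (1000,)).fetchall()

print(plan_obfuscated)
print(plan_rewritten)
print(rows_obfuscated, rows_rewritten)  # same result either way
```

The rewrite only works when the expression can be rearranged this way. A condition such as `a - b = 0` involves two columns and still needs the function-based index.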
Chapter 6 Summary : Most Selective First
| Topic | Details |
| --- | --- |
| Most Selective First | Consideration of column order in compound indexes; the most selective column is not always the best choice. |
| INDEX SKIP SCAN Utilization | Putting the least selective column first may enhance performance. |
| Better Compression | Can result in more effective data compression in some scenarios. |
| Conclusion | Focus on effective support for SQL statements in index design; the first column doesn't always need to be the most selective. |
always the best practice.
1. Statement Support: The primary factor in defining a concatenated index is the number of SQL statements it can support.
2. Potential Benefits of Less Selective Columns:
   - INDEX SKIP SCAN Utilization: Sometimes placing the least selective column first can improve performance.
   - Better Compression: There are scenarios where this leads to more effective data compression.
Debunking the Myth
Several experts have challenged the idea that the most
Chapter 7 Summary : Dynamic SQL is Slow
hindering performance.
- Examples using Java and ORMs like Hibernate show how
to build queries dynamically while still using bind
parameters.
- An index supports efficient data access but can be slowed
by poor conditions in SQL queries, like improper usage of
WHERE clauses.
- The concept of indexing is explained, including various
index types and how data structure interacts with
performance. Bad indexing can lead to performance drops.
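The first point in this list, building a query dynamically while keeping bind parameters, can be sketched as follows. The summary mentions Java and Hibernate. This sketch uses Python with the built-in sqlite3 module instead, and the function `build_employee_query`, the table layout, and the sample rows are hypothetical.

```python
import sqlite3


def build_employee_query(last_name=None, subsidiary_id=None, employee_id=None):
    """Assemble only the WHERE conditions that were actually requested."""
    conditions = []
    params = []
    if last_name is not None:
        conditions.append("UPPER(last_name) = UPPER(?)")
        params.append(last_name)
    if subsidiary_id is not None:
        conditions.append("subsidiary_id = ?")
        params.append(subsidiary_id)
    if employee_id is not None:
        conditions.append("employee_id = ?")
        params.append(employee_id)
    sql = "SELECT first_name, last_name FROM employees"
    if conditions:
        # Only the structure is concatenated; values stay in the params list.
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT,"
    " last_name TEXT, subsidiary_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees (first_name, last_name, subsidiary_id) VALUES (?, ?, ?)",
    [("Ada", "Lovelace", 1), ("Alan", "Turing", 2)],
)

sql, params = build_employee_query(last_name="lovelace", subsidiary_id=1)
rows = conn.execute(sql, params).fetchall()
print(sql)
print(params)
print(rows)
```

Each combination of filters produces its own SQL text, so the optimizer can plan for exactly the conditions that are present, while user-supplied values never become part of the statement.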
Concatenated Indexes
Optimizing Your SQL
Conclusion
Best Quotes from SQL Performance Explained by Markus Winand with Page Numbers
coordinated with the DBAs.
Chapter 2 | Quotes From Pages 84-95
1.Bind variables are the best way to prevent SQL
injection.
2.The Oracle optimizer can re-use a cached execution plan if
the very same statement is executed multiple times.
3.However, the optimizer has to know the actual search term
to be able to use the statistics—that is, it must be provided
as literal value.
4.In the compiler analogy, it's like compiling the source code
every time you run the program.
5.In case of doubt, use bind parameters.
Chapter 3 | Quotes From Pages 96-115
1.SQL's NULL is a constant source of confusion.
Although the basic idea of NULL—to represent
missing data—is rather simple, there are
numerous side effects.
2.The most annoying “special” handling of the Oracle
database is that an empty string is considered NULL.
3.The Oracle database does not include rows into an index if
all indexed columns are NULL.
4.Tip: Putting a column that cannot be NULL into an index
allows indexing NULL like any value.
5.A missing NOT NULL constraint is often causing count(*)
statements to do a full table scan.
6.In fact, virtual columns existed before Oracle 11g as an
implementation vehicle for function based indexes.
7.The oddity with that is that the SQL statement must use the
function—otherwise the index can't be used.
Chapter 4 | Quotes From Pages 116-142
1.The most important performance factor of an
INDEX RANGE SCAN is the leaf node
traversal—keeping that small is the golden rule of
indexing.
2.The freedom to re-arrange it to support other queries is lost.
3.Index for equality first—then for ranges.
4.Only the LIKE prefix can serve as access predicate.
5.The database can—in principle—use one B-Tree index
only per table access.
Chapter 5 | Quotes From Pages 143-168
1.It's common practice to use the TRUNC function
to get rid of the time part: SELECT ... FROM ...
WHERE TRUNC(date_column) =
TRUNC(sysdate - 1)
2.The benefit of this technique is that a straight index on
DATE_COLUMN works.
3.The database will, consistently, transform the string into a
number.
4.The purpose of bind parameters is to decrease parsing
overhead.
5.Use dynamic SQL if you need dynamic where clauses.
6.However, a function based index will work for all the
samples above.
Chapter 6 | Quotes From Pages 173-176
1.Most Selective First
2.the most important factor to consider when defining a
concatenated index is the number of statements it can
support.
3.Don't automatically put the most selective term first in a
concatenated index.
4.the directive to “put the most selective column first”... was
never a sensible rule of thumb.
5.the core of truth behind this myth is related to indexing
independent range conditions
Chapter 7 | Quotes From Pages 179-311
1.Dynamic SQL is Slow - The core of truth behind
the 'Dynamic SQL is Slow' myth is rather simple;
dynamic SQL can be slow—when done wrong.
2.Using dynamic SQL, with bind parameters, allows the
optimizer to take the best execution plan for the particular
combination of where clauses.
3.The execution plan shows the very efficient INDEX
UNIQUE SCAN operation.
4.The key to the cache is basically the literal SQL
string—usually a hash of it. If there is no exact match, a
hard parse is triggered.
5.Never change a running system. At least not without
comprehensive testing beforehand.
Sql Performance Explained Questions
2.Question
Why are function based indexes necessary in some SQL
databases?
Answer:Function based indexes are necessary because they
allow the database to create an index based on the results of a
function, thus improving performance for queries that
involve expressions or transformations of data (like UPPER
or LOWER) rather than direct comparisons to the raw
column values.
3.Question
How does the optimizer understand function based
indexes, and what must be done for them to be effective?
Answer:The optimizer recognizes function based indexes
when the exact expression used in the index definition
appears in SQL queries. Therefore, it's crucial to create a
function based index correctly and ensure that statistics for
both the index and the corresponding table are collected and
up-to-date.
4.Question
What precautions should be taken regarding the use of
user-defined functions with function based indexes?
Answer:User-defined functions must be deterministic to be
used with function based indexes; otherwise, they won't work
as intended since the index relies on consistent outputs for
given inputs. It's important to ensure that the functions don't
depend on changing or non-deterministic values.
5.Question
What is a common mistake developers make regarding
indexing strategies?
Answer:A common mistake is over-indexing, believing that
adding more indexes will improve performance universally.
However, unnecessary indexes can lead to increased database
overhead, as each index takes up space and adds complexity
for data modifications.
6.Question
How can one effectively unify access paths in SQL to
reduce the number of required indexes?
Answer:By establishing standard expressions for queries, like
consistently using either UPPER or LOWER for
case-insensitive searches on the same column, one can utilize
a single index effectively, minimizing redundancy and
reducing the overall load on the database.
7.Question
What steps should be coordinated with DBAs when
dealing with statistics and indexes?
Answer:It is essential to collect statistics for both tables and
their indexes simultaneously, and to back up existing
statistics before making updates to ensure the optimizer has
accurate data to base its decisions on. It's also imperative to
maintain open communication with DBAs regarding the
indexing strategies.
8.Question
Describe the trap posed by functions like SYSDATE in
the context of indexing.
Answer:Functions like SYSDATE cannot be used within an
indexed expression because they are non-deterministic: their
result changes over time. An index stores the value of the
expression at the moment a row is written, so entries that
depend on the current date would become stale as time
passes and the index could no longer be trusted.
9.Question
In what scenarios would using function based indexes be
advantageous?
Answer:Function based indexes are particularly
advantageous in scenarios requiring complex searches or
when dealing with transformed data representations, such as
case-insensitive queries or computations derived from
columns that aren’t straightforward comparisons.
Chapter 2 | Bind Parameter| Q&A
1.Question
What are bind parameters in SQL and why are they
recommended?
Answer:Bind parameters, also known as dynamic
parameters or bind variables, are placeholders (like
?, :name, or @name) used in SQL statements
instead of hard-coded values. They are
recommended because they enhance security by
preventing SQL injection attacks and improve
performance by allowing the database optimizer to
reuse execution plans for identical SQL statements
that differ only in the values of bind parameters.
2.Question
How do bind parameters prevent SQL injection attacks?
Answer:When using bind parameters, user input is never
directly included in the SQL statement. Instead, the values
are provided separately as part of the API call, which means
they cannot alter the structure of the SQL statement. This
separation safeguards against malicious inputs attempting to
inject harmful SQL commands.
3.Question
Why is execution plan reusability important and how do
bind parameters affect it?
Answer:Execution plan reusability is important because it
reduces the overhead of parsing and optimizing SQL
statements each time they are executed. With bind
parameters, the database can use a cached execution plan for
statements, leading to improved performance. However, when
the best plan depends on the actual value, for example with
very unevenly distributed data, the cached plan may not be
the optimal one for every execution.
4.Question
What is the impact of column histograms on SQL
execution plans?
Answer:Column histograms provide the Oracle optimizer
with data distribution information for different column
values, which influences the choice of execution plan. When
using literal values, the optimizer can create an execution
plan optimized for a specific value, allowing for better
performance than when bind parameters are used, which
obscure the exact values during plan creation.
5.Question
What is adaptive cursor sharing and how does it relate to
bind parameters?
Answer:Adaptive cursor sharing is a feature introduced in
Oracle 11g that allows the database to maintain multiple
execution plans for the same SQL statement based on the
specific bind values used during execution. This helps
improve performance by allowing the optimizer to create
tailored plans after noticing performance discrepancies in
earlier executions, thus bridging the gap between the use of
bind parameters and the need for optimal execution plans.
6.Question
When should literal values be used instead of bind
parameters?
Answer:Literal values should be used when there is a
significant imbalance in the distribution of data for a certain
query, especially where it is known that a full table scan or a
specific execution plan would greatly enhance performance
compared to reusing a cached plan built for bind parameters.
7.Question
What are the potential pitfalls of using bind parameters
in SQL?
Answer:While bind parameters enhance security and
performance, they can lead to suboptimal execution plans if
the underlying data distribution is uneven and the optimizer
lacks the necessary information to make informed decisions.
Additionally, excessive reliance on bind parameters may
require additional measures, such as using literal values in
certain cases where performance might be heavily impacted.
8.Question
How does bind peeking work and why can it be
problematic?
Answer:Bind peeking occurs when the Oracle optimizer
examines the actual values of bind parameters during the first
execution of a statement to determine the most efficient
execution plan. However, it is problematic because the
chosen execution plan may not be optimal for subsequent
executions with different values, leading to performance
inconsistencies.
9.Question
Can bind parameters alter the structure of an SQL
statement?
Answer:No, bind parameters cannot change the structure of
an SQL statement. They can only be used as substitutes for
specific values in clauses but cannot be used to dynamically
alter the SQL structure, such as changing the table being
queried.
10.Question
What is the SQL standard regarding positional
parameters?
Answer:The SQL standard defines only positional
parameters, typically represented by the question mark (?).
While most databases and programming languages provide
support for named bind parameters as well, this is considered
a non-standard extension.
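The plan reuse described in question 3 can be pictured with a toy model. The class below is a deliberately simplified sketch that assumes the cache is keyed by the exact SQL text. It is not how any particular database implements its cache, and `ToyPlanCache` and its members are hypothetical names.

```python
class ToyPlanCache:
    """Toy model of an execution plan cache keyed by the exact SQL text."""

    def __init__(self):
        self.plans = {}
        self.hard_parses = 0

    def get_plan(self, sql_text):
        if sql_text not in self.plans:
            # No exact match: the statement has to be parsed and optimized.
            self.hard_parses += 1
            self.plans[sql_text] = "plan for: " + sql_text
        return self.plans[sql_text]


subsidiary_ids = [10, 20, 30, 20]

# Literal values: every distinct value yields a different SQL text.
literal_cache = ToyPlanCache()
for sid in subsidiary_ids:
    literal_cache.get_plan(
        "select last_name from employees where subsidiary_id = %d" % sid
    )

# Bind parameter: the SQL text is identical for every value.
bind_cache = ToyPlanCache()
for sid in subsidiary_ids:
    bind_cache.get_plan("select last_name from employees where subsidiary_id = ?")

print(literal_cache.hard_parses)  # 3 distinct SQL texts
print(bind_cache.hard_parses)     # 1 SQL text, reused for every value
```

The flip side is also visible in this model: the single cached plan for the bind-parameter statement cannot differ per value, which is the limitation discussed in questions 4 to 8.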
Chapter 3 | NULL And Indexes| Q&A
1.Question
What is the significance of understanding how different
SQL databases treat NULL values?
Answer:Understanding how different SQL
databases treat NULL values is crucial for
optimizing performance. Each database has its
unique handling of NULL, impacting how indexes
are built and how queries are executed. For
instance, Oracle considers an empty string as NULL
while MySQL does not. This distinction affects
query results and the efficiency of accessing records.
Without this knowledge, one might inadvertently
write queries that lead to full table scans instead of
utilizing indexes effectively.
2.Question
Why does the Oracle database not include rows with
NULL in indexed columns?
Answer:Oracle does not include rows in an index if all
indexed columns are NULL because it assumes that these
rows provide no value for indexing. This can lead to
challenges as queries filtering on these NULL values might
require a full table scan, negating the benefits of indexing.
This is a fundamental difference compared to databases like
DB2 or SQL Server, which include NULLs in their indexes.
3.Question
What is a potential workaround in Oracle to index NULL
values?
Answer:A workaround to index NULL values in Oracle is to
introduce a non-NULL column into the index definition. By
combining a column that can never be NULL with a
NULLable column in an index, one can ensure that the index
includes rows where the NULL value exists. An example
includes creating a concatenated index that has a constant or
guaranteed NOT NULL column, thereby allowing the
indexing of rows with NULL.
4.Question
How can NOT NULL constraints influence query
efficiency in Oracle?
Answer:NOT NULL constraints play a critical role in
optimizing query performance in Oracle. When an index
contains at least one column with a NOT NULL constraint, the
database knows that every row is present in the index and
can use it for queries filtering on NULL values in the other
indexed columns, rather than performing a full table scan.
Without such a constraint, the database cannot rely on the
index being complete and may not use it, leading to
significant inefficiencies.
5.Question
What are virtual columns in Oracle, and how do they
relate to indexing NULL values?
Answer:Virtual columns in Oracle are derived from
expressions that can also be indexed. Introduced in Oracle
11g, these columns allow for the creation of indexes on
computed values while maintaining properties like NOT
NULL. This feature is particularly useful for including
indexed expressions that control NULL handling, thereby
enhancing the efficiency of queries filtering on NULLs. By
leveraging virtual columns, one can ensure that the database
optimizes NULL entries effectively without the manual
workaround previously described.
6.Question
What is the main distinction between partial indexes in
PostgreSQL and the approach in Oracle?
Answer:The main distinction is that PostgreSQL has native
support for partial indexes, allowing for the creation of
indexes that only include a subset of rows based on a specific
condition. In contrast, Oracle does not support partial indexes
directly but uses an implicit 'column is not null' condition for
every index. This means Oracle indexes cannot contain any
entries where all indexed columns are NULL, which
effectively creates a different kind of partial indexing
mechanism.
7.Question
How can one simulate partial indexes in Oracle using
functions?
Answer:One can simulate partial indexes in Oracle by
defining a function that returns NULL for certain values that
should not be indexed. For example, you can create a
function that returns NULL for states that indicate processed
tasks while returning the relevant values for active tasks. By
indexing this function and using it in queries, Oracle can
utilize the index for selecting specific states without
including the unwanted ones.
8.Question
What are the potential drawbacks of using user-defined
functions for indexing in Oracle?
Answer:The drawbacks of using user-defined functions for
indexing in Oracle include the lack of control over how the
optimizer views the function and the potential for
performance inefficiencies. The database does not impose a
NOT NULL constraint on values returned by user-defined
functions unless specified, so the index may not be used if
the optimizer has to assume that the function could return
NULL. This becomes a problem in complex queries where
function outcomes are hard to predict.
9.Question
What is the importance of knowing how to utilize indexed
columns for filtering NULL values in queries?
Answer:Knowing how to effectively utilize indexed columns
when filtering NULL values in queries is crucial for
achieving optimal database performance. Properly structured
indexes can dramatically reduce query time by allowing the
database to execute queries using indexed paths rather than
resorting to full table scans. This knowledge can help in
designing schemas and writing queries that are efficient,
thereby improving the overall performance of database
operations.
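The partial-index idea from questions 6 and 7 can be tried out in a small sketch. It uses Python's built-in sqlite3 module, relying on the fact that SQLite, like PostgreSQL, accepts a WHERE clause in CREATE INDEX. The `messages` table, its columns, and the index name `messages_todo` are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, receiver INTEGER,"
    " processed TEXT, body TEXT)"
)
# Only unprocessed rows are stored in this index.
conn.execute(
    "CREATE INDEX messages_todo ON messages (receiver) WHERE processed = 'N'"
)
conn.executemany(
    "INSERT INTO messages (receiver, processed, body) VALUES (?, ?, ?)",
    [(1, "N", "hello"), (1, "Y", "old"), (2, "N", "other")],
)

# The query repeats the index condition literally, so the index qualifies.
todo = "SELECT body FROM messages WHERE receiver = ? AND processed = 'N'"
# Without that condition the partial index cannot answer the query.
everything = "SELECT body FROM messages WHERE receiver = ?"

plan_todo = query_plan(conn, todo, (1,))
plan_all = query_plan(conn, everything, (1,))
rows_todo = conn.execute(todo, (1,)).fetchall()

print(plan_todo)
print(plan_all)
print(rows_todo)
```

The condition `processed = 'N'` is written as a literal on purpose: the database has to be able to prove at planning time that the query only needs rows contained in the partial index.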
Chapter 4 | Searching For Ranges| Q&A
1.Question
What is the main performance factor for an INDEX
RANGE SCAN?
Answer:The primary performance factor for an
INDEX RANGE SCAN is keeping the leaf node
traversal small. This emphasizes the importance of
knowing where the scan will start and
stop—particularly for queries involving SQL
inequality operators such as <, >, and BETWEEN.
2.Question
How does the column order in a concatenated index affect
SQL queries that use range conditions?
Answer:The column order in a concatenated index is crucial
when queries involve range conditions. Unlike equality
conditions, which can tolerate any column order, the order of
the columns will dictate the efficiency of range queries. If the
leading column in the index does not align with the range
conditions in the query, the database will not efficiently use
the index.
3.Question
What happens when a second column is added to a query
that already contains inequality conditions?
Answer:Introducing a second column can complicate
indexing strategies. The efficiency of accessing the index
will heavily depend on which column is prioritized. For
example, if you index DATE_OF_BIRTH first and the query
includes a subsidiary condition, the index scan will need to
traverse a much larger portion of the index compared to
indexing SUBSIDIARY_ID first.
4.Question
Why is the FILTER predicate highlighted in execution
plans when it comes to performance?
Answer:FILTER predicates appear in execution plans to
indicate conditions used to narrow down the results during
leaf node traversals. They do not contribute to defining the
start and stop conditions for the range scan, which is why
their effectiveness in reducing the scanned range is limited.
5.Question
What is the rule of thumb for index creation regarding
equality and range conditions?
Answer:The rule of thumb is to index for equality conditions
first and then range conditions. With the equality columns
leading, the range condition limits an already narrow section
of the index, which keeps the scanned leaf-node range small
for queries that combine both types of conditions.
6.Question
What should be avoided when using the LIKE operator to
ensure better performance?
Answer:Avoid LIKE patterns that begin with a wildcard, such
as LIKE '%TERM%', as they require scanning the entire index.
Instead, it's better to use prefix searches that start with fixed
characters, like 'TERM%', which allows the database to
optimize index access.
7.Question
How does the performance of concatenated indexes
compare to single-column indexes?
Answer:In most cases, a single concatenated index that
contains multiple columns is more efficient than multiple
single-column indexes. This is because a concatenated index
allows the database to optimize the searching process across
the linked conditions, reducing the number of scans required.
8.Question
What are bitmap indexes and when are they useful?
Answer:Bitmap indexes are useful for ad-hoc queries,
particularly in data warehousing, as they allow efficient
combining of multiple conditions. However, they are not
suited for high-concurrency environments due to poor
performance on insert, update, and delete operations.
9.Question
What is the significance of prefix selectivity in LIKE
searches?
Answer:The prefix is the part of a LIKE search term before
the first wildcard, and its selectivity determines how much of
the index must be scanned. The longer the prefix, the shorter
the index range scanned, resulting in better performance.
Only the prefix determines whether an index can be utilized
effectively.
10.Question
How can one improve the execution plan when dealing
with multiple independent range conditions?
Answer:When multiple independent range conditions are
present, one approach is to combine conditions into a single
concatenated index, accepting that it may include filter
predicates. Another approach could be leveraging bitmap
indexes, if acceptable, to take advantage of merging
capabilities.
Chapter 5 | Obfuscated Conditions| Q&A
1.Question
What are obfuscated conditions in SQL, and how do they
affect index usage?
Answer:Obfuscated conditions are WHERE clauses
that are phrased in a way that prevents the database
from using an index effectively. A typical example is
applying a function to an indexed column, such as
TRUNC(date_column): an index on date_column
stores the raw column values, not the function
results, so the database cannot use it for this
condition. This often leads to full table scans and
can severely degrade query performance.
2.Question
How does the use of TRUNC on DATE columns in Oracle
hinder index usage?
Answer:Using TRUNC on a DATE column creates an
expression that does not match the indexed column, and the
database treats the function as a black box. For instance, a
query such as SELECT ... WHERE TRUNC(date_column) =
TRUNC(sysdate - 1) cannot use an index on date_column
because that index holds the raw column values rather than
the truncated ones, leading to suboptimal performance.
3.Question
What is a function-based index, and when should it be
used?
Answer:A function-based index allows the database to index
the result of a function applied to a column, thus handling
cases where direct indexing is insufficient. For example, you
can create an index like CREATE INDEX index_name ON
table_name (TRUNC(date_column)) to support queries that
use TRUNC, effectively allowing those queries to utilize the
index and improve performance.
4.Question
What is a better approach than using TRUNC for date
filtering to ensure index usage?
Answer:Rephrasing date conditions as explicit range queries
is a better option. For instance, using SELECT ... WHERE
date_column BETWEEN quarter_begin(?) AND
quarter_end(?) ensures that the database can utilize a
straightforward index on date_column since it uses a natural
range. This approach avoids function application and
promotes better performance.
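A range condition of this kind might be prepared in application code as follows. This is a sketch under several assumptions: SQLite (through Python's sqlite3 module) stands in for the database, dates are stored as ISO-8601 text, the sales table and its index are hypothetical, and quarter_bounds is a made-up helper. It uses a half-open range (greater than or equal to the start, less than the next quarter's start) as a variation on the BETWEEN form quoted above, which avoids having to name the last instant of the quarter.

```python
import sqlite3
from datetime import date


def quarter_bounds(year, quarter):
    """Return (start, next_start) as ISO strings for quarter 1 to 4 of a year."""
    start = date(year, 3 * (quarter - 1) + 1, 1)
    if quarter == 4:
        next_start = date(year + 1, 1, 1)
    else:
        next_start = date(year, 3 * quarter + 1, 1)
    return start.isoformat(), next_start.isoformat()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER, sale_date TEXT, amount REAL)")
conn.execute("CREATE INDEX sales_date ON sales (sale_date)")


def query_plan(sql, params=()):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# The column is compared as-is, so the plain index on sale_date applies.
range_plan = query_plan(
    "SELECT * FROM sales WHERE sale_date >= ? AND sale_date < ?",
    quarter_bounds(2024, 2),
)

print(quarter_bounds(2024, 2))  # ('2024-04-01', '2024-07-01')
print(range_plan)
```

Because no function is applied to sale_date, the ordinary index on that column can bound the scan.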
5.Question
What problems arise from string representation of
numeric values in SQL queries?
Answer:When numeric values are stored as strings, issues
arise when comparisons are done using numeric types. For
example, using SELECT ... WHERE numeric_string = 42
(without quotes) forces the database to convert
numeric_string to a number, which can prevent index range
scans. Instead, you should use string comparisons, like
SELECT ... WHERE numeric_string = TO_CHAR(42) to
effectively utilize indexes.
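Why the conversion has to be applied to the column, and not to the search term, can be illustrated without a database. The following Python sketch is a toy model: the list of strings is invented, and Python's float() merely stands in for a database's string-to-number conversion, whose exact rules vary by product.

```python
# Hypothetical contents of a text column, in the order a string index stores them.
index_entries = sorted([" 42", "042", "1", "100", "4.2e1", "42", "43", "5"])

# A numeric comparison must convert every stored string before comparing.
positions = [i for i, text in enumerate(index_entries) if float(text) == 42]
matches = [index_entries[i] for i in positions]

print(index_entries)
print(matches)    # [' 42', '042', '4.2e1', '42']
print(positions)  # [0, 1, 4, 5]
```

Several different strings represent the number 42, and they are scattered across the string sort order, so a string index cannot answer the numeric comparison with a single range scan. A string comparison against '42' matches exactly one entry and can use the index directly.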
6.Question
Why is using NULL with bind parameters detrimental for
query performance?
Answer:When optional filters are disabled by passing NULL
for a bind parameter, as in (column = ? OR ? IS NULL), the
database cannot optimize the execution plan effectively
because the plan must account for all possible filter
combinations, which typically results in a full table scan. It's
better to construct the SQL dynamically so that it includes only
the necessary filters, allowing the database to make better use
of indexes.
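Building the statement from only the filters that are present might look like the sketch below. It is an illustration, not the book's code: the function name build_employee_query and the column names are hypothetical. Column names come from a fixed allow-list and values are passed as bind parameters, so no user input is ever concatenated into the SQL text.

```python
def build_employee_query(filters):
    """Return (sql, params) containing only the filters that have a value."""
    allowed = ("subsidiary_id", "employee_id", "last_name")  # hypothetical columns
    conditions, params = [], []
    for column in allowed:
        value = filters.get(column)
        if value is not None:
            conditions.append(f"{column} = ?")  # the value stays a bind parameter
            params.append(value)
    sql = "SELECT first_name, last_name FROM employees"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


print(build_employee_query({"last_name": "WINAND"}))
# ('SELECT first_name, last_name FROM employees WHERE last_name = ?', ['WINAND'])
```

Each combination of filters yields its own statement text, so the database can plan each one separately while still reusing plans across different values.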
7.Question
What is the relationship between dynamic SQL and
performance in SQL databases?
Answer:Dynamic SQL allows for better performance because
the statement is built at runtime to contain only the filters
that are actually needed. This helps the database to
generate a more efficient execution plan tailored to the
specific query, thus improving speed. Using bind parameters
for the values in dynamic SQL enhances security and
performance, as it prevents SQL injection and reduces
parsing overhead.
8.Question
Can mathematical expressions in SQL prevent index
usage?
Answer:Yes, mathematical expressions such as SELECT
numeric_number FROM table_name WHERE
numeric_number - 1000 > ? typically prevent index usage
since the database would need to compute values rather than
use direct index lookups. This negates the benefit of having
an index for those columns.
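One common remedy is to move the arithmetic to the other side of the comparison so that the column stands alone. The sketch below compares the two forms, assuming SQLite through Python's sqlite3 module as a stand-in database; the index name num_idx is hypothetical, and the rewrite assumes the addition does not overflow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER, numeric_number INTEGER)")
conn.execute("CREATE INDEX num_idx ON table_name (numeric_number)")


def query_plan(sql, params=()):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Arithmetic on the column: the index cannot bound the scan.
obfuscated_plan = query_plan(
    "SELECT numeric_number FROM table_name WHERE numeric_number - 1000 > ?",
    (500,),
)
# Same condition with the column isolated: the index can bound the scan.
rewritten_plan = query_plan(
    "SELECT numeric_number FROM table_name WHERE numeric_number > ? + 1000",
    (500,),
)

print(obfuscated_plan)
print(rewritten_plan)
```

SQLite reports a bounded index access as SEARCH and a full pass as SCAN. Other databases use different plan vocabulary, but the distinction is the same.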
9.Question
What can be done to retain index functionality when
dealing with equations in SQL?
Answer:Transform the condition so that the indexed column
stands alone on one side and the constants are moved to the
other. When several columns are involved and none can be
isolated, move all column references to one side, as in
SELECT a, b FROM table_name WHERE a - b = 0, and
create a function-based index like CREATE INDEX
a_minus_b ON table_name (a - b) so that the database can
use an index for the transformed expression.
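The effect of indexing the expression itself can be checked in a small experiment. This is a sketch, assuming SQLite (through Python's sqlite3 module), which supports indexes on expressions; Oracle calls the same idea a function-based index, and syntax and availability differ between products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (a INTEGER, b INTEGER)")


def query_plan(sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


sql = "SELECT a, b FROM table_name WHERE a - b = 0"
plan_before = query_plan(sql)

# Index the expression exactly as it is written in the WHERE clause.
conn.execute("CREATE INDEX a_minus_b ON table_name (a - b)")
plan_after = query_plan(sql)

print(plan_before)
print(plan_after)
```

The expression in the query has to match the indexed expression for the index to be considered, which is why the condition is first brought into a fixed form.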
10.Question
What practices should be avoided to improve SQL query
performance, particularly with data types?
Answer:Avoid applying functions to indexed columns, avoid
mixing different data types in comparisons, avoid
unnecessary NULL checks that can prevent index usage, and
avoid static queries that rely on bind parameters to cover
every filter combination. Instead, prefer clear, direct queries
that can leverage indexes effectively.
Chapter 6 | Most Selective First| Q&A
1.Question
What is the common myth about building concatenated
indexes in SQL?
Answer:The common myth is that the most selective
column should always be placed in the first position
of a concatenated index.
2.Question
Why is it incorrect to always prioritize the most selective
column in concatenated indexes?
Answer:Prioritizing the most selective column is incorrect
because the effectiveness of an index depends on how many
statements it can support. Sometimes, placing the least
selective column first can enhance performance for certain
queries, facilitate index skip scans, and improve
compression.
3.Question
What are the potential benefits of placing a less selective
column before a more selective one in a concatenated
index?
Answer:Benefits include better support for index skip scans
and enhanced index compression which can lead to improved
performance in specific scenarios.
4.Question
Who are some of the experts that challenge the myth
about selective columns in index design?
Answer:Experts challenging this myth include Tom Kyte,
who highlights it in his book, and Guy Harrison, who
provides guidelines against automatically placing the most
selective term first. Jonathan Lewis also criticizes the myth,
noting it lacks reason after database version 6.0.
5.Question
What is the core truth about index design related to index
selectivity?
Answer:The core truth is that selectivity should primarily be
considered when indexing independent range conditions,
rather than being a blanket rule for index design.
6.Question
What should a database designer consider as the most
important factor when creating concatenated indexes?
Answer:The most important factor should be the number of
statements that the index can support, rather than just the
selectivity of the columns.
7.Question
What advanced indexing technique is mentioned in
relation to index column ordering?
Answer:INDEX SKIP SCAN is an advanced technique that
may benefit from placing less selective columns before more
selective ones in concatenated indexes.
Chapter 7 | Dynamic SQL is Slow| Q&A
1.Question
What is the main issue with dynamic SQL, and how can it
be mitigated for performance?
Answer:The main issue with dynamic SQL is that it
can introduce high parsing overhead when search
terms are embedded as literals: each variation is a
new statement that is not found in the plan cache, so
the database has to recreate the execution plan
frequently. This can lead to poor performance. To
mitigate this, use bind parameters instead of
concatenating values into the SQL string. This not
only improves security by preventing SQL injection
but also allows the database to reuse cached
execution plans, optimizing performance.
2.Question
How does dynamic SQL differ from embedded SQL in
terms of execution?
Answer:Dynamic SQL allows the structure of the SQL
statement itself to change at runtime, unlike embedded SQL,
which is static and compiled into the program's source code.
Dynamic SQL is processed at runtime as a string, leading to
more flexibility but potential performance issues if not
handled correctly.
3.Question
Why is using bind parameters crucial when executing
dynamic SQL queries?
Answer:Using bind parameters in dynamic SQL queries is
crucial because it prevents SQL injection and enhances
performance by enabling the database to reuse cached
execution plans. Without bind parameters, the SQL text
differs with each execution, so the database cannot reuse a
cached plan and has to parse the statement again, increasing
overhead.
4.Question
What are the consequences of using literals in SQL
instead of bind parameters?
Answer:Using literals in SQL can expose applications to
SQL injection attacks and can lead to increased parsing
overhead due to a lack of plan reuse. This means that the
database must recreate execution plans for each unique literal
value in the query, which can severely degrade performance,
especially if the query runs frequently with different literal
values.
5.Question
Can you give an example of how to implement
parameters using different programming languages?
Answer:Certainly! In Java, you can use a PreparedStatement
to implement parameters like this: PreparedStatement stmt =
con.prepareStatement("SELECT first_name FROM
employees WHERE employee_id = ?"); stmt.setInt(1,
employeeId);. In Python, with a DB-API driver that accepts ?
placeholders (such as sqlite3), you might write:
sql_query = "SELECT first_name FROM employees
WHERE employee_id = ?"; cursor.execute(sql_query,
(employeeId,)). This pattern applies across languages like
C#, PHP, and others, emphasizing the importance of using
bind variables.
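A complete, runnable version of the Python variant could look like this. It is a minimal sketch using the sqlite3 module from the standard library with an in-memory database; the sample rows and the helper name first_name_for are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees (employee_id, first_name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)


def first_name_for(employee_id):
    """Look up one employee; the value travels as a bind parameter, not as SQL text."""
    row = conn.execute(
        "SELECT first_name FROM employees WHERE employee_id = ?",
        (employee_id,),
    ).fetchone()
    return row[0] if row else None


print(first_name_for(2))           # Grace
print(first_name_for("2 OR 1=1"))  # None: the input is compared as a value
```

Because the input is bound as a value, a string such as "2 OR 1=1" is never interpreted as SQL. It simply fails to match any employee_id.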
6.Question
Why does dynamic SQL sometimes lead to slower
performance, and what is a solution?
Answer:Dynamic SQL can lead to slower performance
because it may cause frequent hard parsing, especially when
different literals are embedded in the queries. A solution is to
use prepared statements with bind parameters, which allows
repeated execution without new parsing, effectively reducing
overhead.
7.Question
What are the differences between hard parsing and soft
parsing in SQL?
Answer:Hard parsing involves building a new execution plan
from scratch, which is resource-intensive as it evaluates all
components of the SQL. Soft parsing, on the other hand,
makes use of a cached execution plan that has been stored
from a previous query execution. It is much faster because it
bypasses the need to re-evaluate the SQL, only performing
minor checks such as access rights.
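The difference can be pictured with a toy model of a plan cache. This is a deliberate simplification written in Python: real databases key their caches on more than the SQL text, and the class ToyPlanCache is invented for this illustration. It does show why literal values defeat plan reuse while bind parameters enable it.

```python
class ToyPlanCache:
    """Toy model: a plan is reused only when the exact SQL text was seen before."""

    def __init__(self):
        self.plans = {}
        self.hard_parses = 0
        self.soft_parses = 0

    def parse(self, sql_text):
        if sql_text in self.plans:
            self.soft_parses += 1  # soft parse: reuse the cached plan
        else:
            self.hard_parses += 1  # hard parse: build and cache a new plan
            self.plans[sql_text] = f"plan for: {sql_text}"
        return self.plans[sql_text]


def run(statements):
    """Parse all statements against a fresh cache and return (hard, soft) counts."""
    cache = ToyPlanCache()
    for sql_text in statements:
        cache.parse(sql_text)
    return cache.hard_parses, cache.soft_parses


ids = range(1, 101)
# Literal values make every statement text unique (shown only as an anti-pattern).
literal_counts = run(
    f"SELECT first_name FROM employees WHERE employee_id = {i}" for i in ids
)
# A bind parameter keeps the statement text identical across executions.
bind_counts = run(
    "SELECT first_name FROM employees WHERE employee_id = ?" for _ in ids
)

print(literal_counts)  # (100, 0)
print(bind_counts)     # (1, 99)
```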
8.Question
How does the order of columns in a concatenated index
affect its performance?
Answer:The order of columns in a concatenated index
significantly affects its usability and performance, especially
when dealing with queries that involve equality and range
conditions. Columns used in equality conditions should
generally come first, followed by columns used in range
conditions, and the order should let the index support as
many queries as possible. Selectivity alone should not
decide the order.
9.Question
What is the impact of SQL NULL on indexing and
performance?
Answer:In Oracle, rows with all indexed columns set to
NULL are not included in the index, so queries that check for
NULL values cannot use it and performance can degrade. To
make such rows indexable, include at least one column that
can never be NULL in a concatenated index. Understanding
how NULLs interact with indexes is crucial for designing
efficient queries.
10.Question
Explain the importance of statistics in SQL optimization
and how to maintain them.
Answer:Statistics provide the query optimizer with vital
information about data distribution, which helps in creating
efficient execution plans. Maintaining them involves
regularly gathering statistics using built-in database tools,
such as the DBMS_STATS package in Oracle, to ensure the
optimizer works with the most accurate and up-to-date
information, preventing performance degradation.
11.Question
How should one handle date comparisons in SQL to
optimize index usage?
Answer:To optimize index usage for date comparisons in
SQL, avoid using functions like TRUNC on date columns, as
this prevents index scans. Instead, structure queries to
leverage explicit range conditions so that the query can
utilize the underlying indexes effectively, thus improving
performance.
12.Question
Describe a situation where dynamic SQL can outperform
static SQL despite the myth that ‘dynamic SQL is slow’.
Answer:When queries vary significantly based on user input
or application conditions, dynamic SQL can outperform static
SQL. The application builds a statement that contains only
the specific search criteria, so the database can optimize it
for exactly those conditions. This works best when the
statement uses prepared statements and bind parameters.
13.Question
How do function-based indexes aid in running efficient
queries, particularly with functions like UPPER or
TRUNC?
Answer:Function-based indexes allow indexing on
expressions involving functions such as UPPER or TRUNC,
enabling efficient query executions that involve these
functions in their WHERE clauses. This indexing helps the
database quickly access records matching transformed values
without incurring the performance hit of evaluating each row
during query execution.
Sql Performance Explained Quiz and
Test
rows from indexes if all indexed columns are
NULL.
2.MySQL treats empty strings as NULL and does not include
NULL in every index.
3.Oracle requires an indexed column with a NOT NULL
constraint to support queries like 'WHERE column IS
NULL'.
Chapter 4 | Searching For Ranges| Quiz and Test
1.SQL inequality operators like <, >, and
BETWEEN can be optimized just like exact key
lookups.
2.Rearranging the order of columns in a concatenated index
can always fit different queries and improve optimization.
3.Using single indexes for multiple columns is always better
than using a single multi-column index.
Chapter 5 | Obfuscated Conditions| Quiz and Test
1.Obfuscated conditions in SQL queries can hinder
proper index usage and are considered
anti-patterns.
2.Using the TRUNC function on DATE columns enhances
index usage for query performance.
3.Concatenating DATE and TIME columns to apply filters
can improve query performance.
Chapter 6 | Most Selective First| Quiz and Test
1.The most selective column should always be placed
first in a compound index for optimal
performance.
2.The primary factor in defining a concatenated index is its
ability to support a large number of SQL statements.
3.Experts unanimously agree that the most selective column
should always be prioritized in concatenated indexes.
Chapter 7 | Dynamic SQL is Slow| Quiz and Test
1.Dynamic SQL is inherently slow regardless of how
it's implemented.
2.Using bind parameters with dynamic SQL can help
improve performance by allowing execution plan reuse.
3.Poorly constructed WHERE clauses can lead to
performance drops in SQL queries.