0% found this document useful (0 votes)

140 views77 pages

Query Optimization in Relational Database Systems

Query optimization in relational database systems involves finding the most efficient way to evaluate a query by considering different execution plans composed of relational algebra operations. The optimizer represents each possible execution strategy as a query evaluation plan (QEP) and aims to select the best plan to minimize the cost of computing the query results. Some techniques used in optimization include pushing selections down, reordering joins and applying indexes to improve performance.

Uploaded by

Gilbert Heß

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

140 views77 pages

Query Optimization in Relational Database Systems

Uploaded by

Gilbert Heß

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Query Optimization in

Relational Database Systems

It is safer to accept any chance
that offers itself, and extemporize
a procedure to fit it, than to get a
good plan matured, and wait
for a chance of using it.

Thomas Hardy (1874)

in Far from the Madding Crowd

CS3223: Query Optimization 1

Query Optimization
• Since each relational op returns a relation, ops can be
composed!
• Queries that require multiple ops to be composed may
be composed in different ways - thus optimization is
necessary for good performance, e.g. A B C D can
be evaluated as follows:
• (((A B) C) D)
• ((A B) (C D))
• ((B A) (D C))
• …

CS3223: Query Optimization 2

Query Optimization
• Each strategy can be represented as a query evaluation plan
(QEP) - Tree of R.A. ops, with choice of algorithms for each
op.
SM HJ
NL
D
NL HJ INL
C

A B A B C D

(((A B) C) D) ((A B) (C D))

• Goal of optimization: To find the “best” plan that compute the

same answer (to avoid “bad” plans)
CS3223: Query Optimization 3
Motivating Examples

Sailors (sid: integer, sname: string, rating: integer, age: real)

Reserves (sid: integer, bid: integer, day: dates, rname: string)

• Reserves:
• Each tuple is 40 bytes long, 100 tuples per page, 1000 pages.
• Sailors:
• Each tuple is 50 bytes long, 80 tuples per page, 500 pages.

CS3223: Query Optimization 4

SELECT [Link]
FROM Reserves R, Sailors S
WHERE [Link]=[Link] AND
Example
[Link]=100 AND [Link]>5

sname
sname
sname
rating > 5

bid=100 rating > 5

sid=sid
sid=sid

bid=100 Sailors bid=100 rating > 5

sid=sid

Reserves Reserves Sailors

Sailors Reserves
Logical plans
CS3223: Query Optimization 5
SELECT [Link]
FROM Reserves R, Sailors S
WHERE [Link]=[Link] AND
Example (Cont)
[Link]=100 AND [Link]>5
Query Evaluation Plan:
• Cost: 500+500*1000 I/Os (On-the-fly)
sname
• Memory: 3

bid=100 rating > 5 (On-the-fly)

(Page Nested Loops)

sid=sid

Sailors Reserves

Physical plan
CS3223: Query Optimization 6
Alternative Plan 1 (No Indexes)
• Main difference: push selections down
• Assume 5 buffers, T1 = 10 pages (100 boats,
uniform distribution), T2 = 250 pages (10 ratings,
(On-the-fly)
uniform distribution) sname

• Cost of plan:
• Scan Reserves (1000) + write temp T1 (10 pages, if we (Sort-Merge)
have 100 boats, uniform distribution) sid=sid
• Scan Sailors (500) + write temp T2 (250 pages, if we
(T1) (T2)
have 10 ratings) bid=100 rating > 5
• Sort T1 (2*2*10), sort T2 (2*4*250), merge (10+250)
• Total: 4060 page I/Os Reserves Sailors

CS3223: Query Optimization 7

Alternative Plan 2 (With Indexes)
• Clustered index on bid of Reserves
• 100,000/100 = 1000 tuples on 1000/100 = 10 pages
• Hash index on sid (format 2). Join column sid is a key sname
(On-the-fly)
for Sailors
• INL with pipelining (outer is not materialized) rating > 5 (On-the-fly)
• Project out unnecessary fields from outer doesn’t help
• At most one matching tuple, unclustered index (INL
sid=sid with pipelining )
on sid OK
• Did not push “rating>5” before the join. Why? bid=100 Sailors
• Cost? (Use hash
index; do
not write
• Selection of Reserves tuples (10 I/Os); for each, must Reserves result to
temp)
get matching Sailors tuple (1000*2.2); total 2210 I/Os

CS3223: Query Optimization 8

Overview of Query Optimization
SQL query
parse
parse tree
convert
answer
logical query plan
execute
apply laws
Pi
“improved” l.q.p
estimate result sizes pick best
l.q.p. +sizes {P1,C1>...}
consider physical plans estimate costs
{P1,P2,…..}
CS3223: Query Optimization 9
Example: SQL query

SELECT sname
FROM Sailors
WHERE sid IN (
SELECT sid
FROM Reserves
WHERE rname LIKE ‘Tan%’
);

(Find names of sailors whose reservation is made by

someone whose name begins with “Tan”)
CS3223: Query Optimization 10
Example: Parse Tree
<Query>

<SFW>

SELECT <SelList> FROM <FromList> WHERE <Condition>

<Attribute> <RelName> <Tuple> IN <Query>

sname Sailors <Attribute> ( <Query> )

sid <SFW>

SELECT <SelList> FROM <FromList> WHERE <Condition>

<Attribute> <RelName> <Attribute> LIKE <Pattern>

sid Reserves rname ‘Tan%’

CS3223: Query Optimization 11
Example: Improved Logical
Logical Query Plan Query Plan
sname sname
sid=sid
sid=sid

Sailors sid Sailors sid
rname LIKE ‘Tan%’ rname LIKE ‘TAN%’
Reserves Can we improve further?
Reserves
e.g. Push project to Sailors?
CS3223: Query Optimization 12
Example: Estimate Result Sizes

Need expected size

Sailors 


Reserves

CS3223: Query Optimization 13

Example: One Physical Plan

Hash join Parameters: join order,

memory size, project attributes,...

SEQ scan index scan

Parameters:
Select Condition,...

Sailors Reserves

CS3223: Query Optimization 14

Example: Estimate Plan Cost to
Find Optimal QEP
QEPs

P1 P2 …. Pn

C1 C2 …. Cn

Pick best!
CS3223: Query Optimization 15
Relational Algebra Equivalences
What about p1p2 … pn(R)?

CS3223: Query Optimization 16

Relational Algebra Equivalences

CS3223: Query Optimization 17

Bags vs. Sets
R = {a,a,b,b,b,c}
S = {b,b,c,c,d}
RS=?

• Option 1 SUM
R  S = {a,a,b,b,b,b,b,c,c,c,d}
• Option 2 MAX
R  S = {a,a,b,b,b,c,c,d}

CS3223: Query Optimization 18

“SUM” is implemented
• Use “SUM” option for bag unions
• Some rules cannot be used for bags
• e.g. A s (B s C) = (A s B) s (A s C)

Let A, B and C be {x}

B B C = {x, x} A B (B B C) = {x}
A B B = {x} A B C = {x} (A B B) B (A B C) = {x, x}

CS3223: Query Optimization 19

Query Optimizer
• Find the “best” plan (more often avoid the bad plans)
• Comprises the following
• Plan space
• huge number of alternative, semantically equivalent plans
• computationally expensive to examine all
• Conventional wisdom: avoid bad plans
• need to include plans that have low cost
• Enumeration algorithm (Search space)
• search strategy (optimization algorithm) that searches through the plan
space
• has to be efficient (low optimization overhead)
• Cost model
• facilitate comparisons of alternative plans
• has to be “accurate”
CS3223: Query Optimization 20
Join Plan Notation

CS3223: Query Optimization 21

Plan Space
• Left-deep trees: right child has to be a base table
• Right-deep trees: left child has to be a base table
• Deep trees: one of the two children is a base table
• Bushy tree: unrestricted
space of bushy tree is the largest,
it also contains left/right/deep trees

D D

C C

A B C D A B A B
Bushy tree Left-deep tree Deep tree
in general, good plans can be found in left/deep tree
CS3223: Query Optimization 22
Query Plan Space for R S T

This has not accounted for the algorithms!

CS3223: Query Optimization 23
Search Algorithms (and Search Space)
• Exhaustive (Complete space)
• enumerate each possible plan, and pick the best
• Greedy Techniques (Very small – polynomial)
• smallest relation next
• smallest result next
• typically polynomial time complexity
• Randomized/Transformation Techniques (Large space – can be
complete if you run the algorithms indefinitely)
• System R approach (Almost complete)
• Dynamic Programming with Pruning

CS3223: Query Optimization 24

Multi-Join Queries
• Focus on multi-join queries first means dont consider selection/projection

• Join is the most expensive operations

• Selections and projections can be pushed down as early as
possible
• Query
• A query graph whose nodes are relations and edges represent a
join condition between the two nodes

CS3223: Query Optimization 25

Greedy Algorithm (Example)
• Heuristic 1: Smallest relation next
• Suppose Ri < Rk for i < k
R1

All plans must begin with R1

R2
small relation --> table will be small
R1 R3 --> intermediate result will be small

R5 R4 R2
R3 R4 R5
Join Graph All plans beginning with R2-R5 have been pruned!

CS3223: Query Optimization 26

Greedy Algorithm (Example)
• Smallest relation next
• What if R1 < R5 < R3 < R2 < R4???
using "smallest relation next" will result in large cross products and large
intermediate table

R2
Another heuristic:
R1 R3
Smallest result next?

R5 R4

CS3223: Query Optimization 27

plan space
algorithm

Cost Models
cost model

• Typically, a combination of CPU and I/O costs

• Objective is to be able to rank plans
• exact value is not necessary
• Relies on
• statistics on relations and indexes
• formulas to estimate CPU and I/O cost
• formulas to estimate selectivities of operators and intermediate
results

CS3223: Query Optimization 28

Cost Estimation
• For each plan considered, must estimate cost:
• Must estimate cost of each operation in plan tree
• Depends on input cardinalities
• Depends on buffer size, availability of indexes, algorithms used,
etc. ?
• We’ve already discussed how to estimate the cost of
operations (sequential scan, index scan, joins, etc.) ? R4

• Must estimate size of result for each operation in ?

R3
tree!
R1 R2
• Use information about the input relations
• Typical assumptions like uniform distribution of data and
independence of predicates can simplify size estimation but is
error prone

CS3223: Query Optimization 29

Statistics and Catalogs
• Need information about the relations and indexes involved
Catalogs typically contain at least:
• # tuples of R (||R||), #bytes in each R tuple (S(R))
• # blocks/pages to hold all R tuples (|R|)
• # distinct values in R for attribute A (V(R,A))
• NPages for each index
• Index height, low/high key values (Low/High) for each tree index
• Catalogs updated periodically
• Updating whenever data changes is too expensive; lots of
approximation anyway, so slight inconsistency ok

CS3223: Query Optimization 30

Estimation Assumptions
• Uniformity assumption
• Uniform distribution of attribute values
• Independence assumption
• Independent distribution of values in different attributes
• Inclusion assumption
• For R R.A=S.B S, if V(R, A) ≤ V(R, B), then A(R)  B(S)
• V(R, A) is the number of distinct values of R.A
assume key-foreign key relationship

CS3223: Query Optimization 31

Example
R A B C D A: 20 byte string
cat 1 10 a B: 4 byte integer
cat 1 20 b
dog 1 30 a
C: 8 byte string
dog 1 40 c D: 5 byte string
bat 1 50 d

||R|| = 5 S(R) = 37
V(R,A) = 3 V(R,C) = 5
V(R,B) = 1 V(R,D) = 4
CS3223: Query Optimization 32
Size estimate for W = Z=val (R)
||R||
R A B C D V(R,A)=3 ||W|| =
V(R,Z)
cat 1 10 a V(R,B)=1
cat 1 20 b
V(R,C)=5 S(W) = S(R)
dog 1 30 a
dog 1 40 c V(R,D)=4
bat 1 50 d
Assumption:
Values in select expression Z = val are uniformly
distributed over possible V(R,Z) values
Alternative assumption: use DOM(R,Z)
CS3223: Query Optimization 33
What about W = z  val (R)?
Solution: Estimate values in range
R Z
Min=1 V(R,Z)=10

W= z  15 (R)
Max=20

20-15+1
f (fraction of range) = = 6 ||W|| = f  ||R||
20-1+1 20

Alternative: (Max(Z)-value)/(Max(Z)-Min(Z))
CS3223: Query Optimization 34
W = R1 R2
R1 A B C R2 A D

Assumption:
V(R1,A)  V(R2,A)  Every A value in R1 is in R2
V(R2,A)  V(R1,A)  Every A value in R2 is in R1

“containment of value sets”

CS3223: Query Optimization 35

Computing T(W) when V(R1,A)  V(R2,A)

R1 A B C R2 A D
Take
1 tuple Match

1 tuple of R1 matches with ||R2|| tuples of R2

V(R2,A)
||R2||
so ||W|| =||R1|| 
V(R2,A)
||R2||  ||R1||
If V(R2,A)  V(R1,A) ||W|| =
V(R1,A)
CS3223: Query Optimization 36
For complex expressions, need
intermediate T,S,V results
E.g. W = [A=a (R1) ] R2

Treat as relation U

||U|| = ||R1||/V(R1,A) S(U) = S(R1)

Also need V (U, *) !!

CS3223: Query Optimization 37
Example
R1 A B C D V(R1,A)=3
cat 1 10 10 V(R1,B)=1
cat 1 20 20 V(R1,C)=5
dog 1 30 10 V(R1,D)=3
dog 1 40 30 U =  A=a (R1)
bat 1 50 10
||R1||
V(U,A) = 1? V(U, B) = 1? (= V(R, B)) V(U,C) = ?
V(R1,A)

V(D,U) … somewhere in between V(U,B) and V(U,C)

CS3223: Query Optimization 38
For Joins U = R1(A,B) R2(A,C)
V(U,A) = min { V(R1, A), V(R2, A) }
V(U,B) = V(R1, B)
V(U,C) = V(R2, C)

(Assumption: Preservation of value sets)

CS3223: Query Optimization 39

Example

Z = R1(A,B) R2(B,C) R3(C,D)

R1 ||R1|| = 1000 V(R1,A)=50 V(R1,B)=100

R2 ||R2|| = 2000 V(R2,B)=200 V(R2,C)=300
R3 ||R3|| = 3000 V(R3,C)=90 V(R3,D)=500

CS3223: Query Optimization 40

Z = R1(A,B) R2(B,C) R3(C,D)
||R1|| = 1000 V(R1,A)=50 V(R1,B)=100
Partial Result: U = R1 R2 ||R2|| = 2000 V(R2,B)=200 V(R2,C)=300
||R3|| = 3000 V(R3,C)=90 V(R3,D)=500

||U|| = 10002000 V(U,A) = 50

200 V(U,B) = 100
V(U,C) = 300

Z=U R3
V(Z,A) = 50
||Z|| = 100020003000 V(Z,B) = 100
200300 V(Z,C) = 90
V(Z,D) = 500
CS3223: Query Optimization 41
Errors in Estimating Size of Plan
• Errors
• source include uniformity assumption, and inability to
capture correlation, accuracy of cost model, statistics, etc.
• propagated to other operators at the higher level of the plan R4
tree
R3
• Dealing with errors
R1 R2
• Maintain more detailed statistics (at finer granularity)
• During runtime, may need to sample the actual intermediate
results
• dynamic query optimization

CS3223: Query Optimization 42

Statistical Summaries of Data
• More detailed information are sometimes stored e.g.,
histograms of the values in some attributes
• a histogram divides the values on a column into k buckets
• k is predetermined or computed based on space allocation
• several choices for “bucketization’’ of values
• If a table has n records, an equi-depth histogram divides the set of
values on a column into k ranges such that each range has
approximately the same number of records, i.e., n/k
• Equi-width histogram – each bucket has (almost) equal number of
values
• Witihin each bucket, records are uniformly distributed across the range
of the bucket
• Frequently occurring values may be placed in singleton buckets
CS3223: Query Optimization 43
Histograms
15 records are uniformly
distributed

3.5
3.0
2.25

CS3223: Query Optimization 44

Estimations with Histograms

3.5
3.0
2.25

Query Q:  A=6 (R) Query Q:  A=10 (R)

Actual value, ||Q|| = 3 Actual value, ||Q|| = 0

Without histogram, ||Q|| = 45/15 = 3 Without histogram, ||Q|| = 45/15 = 3
Equiwidth histogram, ||Q|| = 15/3 = 5 Equiwidth histogram, ||Q|| = 1
Equidepth histogram, ||Q|| = 14/4 = 3.5 Equidepth histogram, ||Q|| = 1.75

CS3223: Query Optimization 45

Query Evaluation

CS3223: Query Optimization 46

Query Evaluation Approaches
• Materialization evaluation
• An operator is evaluated only when each of its operands
has been completely evaluated or materialized
• Intermediate results are materialized to disk
• Pipelining evaluation
• The output produced by an operator is passed directly to
its parent operator
• Execution of operators is interleaved

CS3223: Query Optimization 47

Materialization Evaluation

CS3223: Query Optimization 48

Pipeline Evaluation: Iterator Model
consumer
Open()

Open()

Open() Open()

CS3223: Query Optimization 49

Pipeline Evaluation: Iterator Model
consumer
GetNext() t

All operators are

GetNext() t
running simultaneously

GetNext() GetNext()
tm ta

CS3223: Query Optimization 50

Pipeline Evaluation with Partial Materialization

Plan 1
Plan 2
CS3223: Query Optimization 51
Randomized Techniques
• Employ randomized/transformation techniques for query
optimization
• State space -- space of plans, State -- plan
• Each state has a cost associated with it
• determined by some cost model
• A move is a perturbation applied to a state to get to another
state
• a move set is the set of moves available to go from one state to another
• any one move is chosen from this move set randomly
• each move set has a probability associated to indicate the probability of selecting the move
• Two states are neighboring states if one move suffices to go
from one state to the other
CS3223: Query Optimization 52
Randomized Algorithm (Example)
State/QEP 1
R4 R4 Neighbor
R3 R3 of QEP 1

R1 R2 R2 R1

A move

Neighbor R4 R4
of QEP 1
R3 R3
R1 R2 R2 R1

CS3223: Query Optimization 53

More on Randomized Techniques
• A local minimum in the state space is a state such that its
cost is lower than that of all neighboring states
• A global minimum is a state which has the lowest cost
among all local minima
• at most one global minimum
• A move that takes one state to another state with a lower
cost is called a downward move; otherwise it is an upward
move
• in a local/global minimum, all moves are upward moves

CS3223: Query Optimization 54

Local Optimization
Repeat until
a near-optimal S = initialize() // initial plan
minimum
Is reached
minS = S // cost of plan S – currently the best
repeat {
repeat {
By doing so newS = move(S) // move to a new plan
repeatedly,
a local minimum if (cost(newS) < cost(S)) A move is accepted if it is a
downward move, i.e., has
can S = newS a lower cost
be reached } until (“local minimum reached”)
if (cost(S) < cost(minS))
minS = S
newStart(S); // iterate with a different initial plan
} until (“stopping condition satisfied”)
return (minS);
CS3223: Query Optimization 55
Issues on Local Optimization
• How is the start state obtained?
Run: sequence of
• The state in which we start a run
moves to a local
• The start state of the first run is the initial state minimum from the
• All start states should be different start state
• Should be obtained quickly
• Random
• greedy heuristics
• making a number of moves from the local minimum, except that this time each move
is accepted irrespective of whether it increases or decreases the cost
• How is the local minimum detected? need to move to a plan with a higher
cost in order to jump out of the local
minimum
• How is the stopping criterion detected?

CS3223: Query Optimization 56

Issues on Local Optimization (Cont)
• How is the local minimum detected?
• Not practical to examine all neighbors to verify that one has
reached a local minimum
• Based on random sampling
• examine a sufficiently large number of neighbors
• if any one is lower, we move to that state, and repeat the process
• if no tested neighbor is of lower cost, the current state can be considered a local
minimum
• the number of neighbors to examine can be specified as a parameter,
and is called the sequence length
• Can also be time-based

CS3223: Query Optimization 57

Issues in Local Optimization (Cont)

• How is the stopping criterion detected?

• Determines the number of times that the outer loop is
executed
• Can be fixed and is given by sizeFactor*N, where
sizeFactor is a parameter, N is the number of relations
• Why N? Can it be a constant?
if you have a 4 table join you might not need 500 initial plans.
if you have a 500 table join, 100 initial plans likely will not find an optimal plan

CS3223: Query Optimization 58

What about the MOVEs?
Transformation Rules
• Restricted to left-deep trees
• all possible permutations of the N relations
• let S be the current state, S = (… i … j … k …)
• swap
• select two relations, say i and j at random. Swapped i and j to get the new
state newS = ( … j … i … k … ) might result in a cross product being removed, hence
reducing the cost
• 3Cycle
• select three relations, say i and j and k at random. The move consists of
cycling i, j and k: i is moved to the position of j, j is moved to the position of k
and k is moved to the position of i. The resultant new state newS = ( … k …
i…j…)
• Other methods (e.g., join methods)? Bushy trees?

CS3223: Query Optimization 59

Comparison between Exhaustive,
Greedy and Randomized Algorithms
randomised algorithm might have high overhead
if it keeps repeating the same few plans.
• Search space
• Plan quality
• Optimization overhead

CS3223: Query Optimization 60

CS3223: Query Optimization 61

Dynamic Programming (Cont)
{}

{1} {2} {3} {4}

{1 2} {1 3} {1 4} {2 3} {2 4} {3 4}

{1 2 3} {1 2 4} {2 3 4} {1 3 4}

{1 2 3 4}
CS3223: Query Optimization 62
Dynamic Programming (Left-Deep Trees)
(System R)
• The algorithm proceeds by considering increasingly larger
subsets of the set of all relations
• Builds a plan bottom-up (beginning from 1 table, then to 2, and so on)
• Plans for a set of cardinality i are constructed as extensions of the best
plan for a set of cardinality i-1
• Keep only ONE best plan for each set of cardinality i
• Search space can be pruned based on the principal of optimality
• if two plans differ only in a subplan, then the plan with the better subplan is
also the better plan

CS3223: Query Optimization 63

Principle of Optimality
{}

{1} {2} {3} {4}

{1 2} {1 3} {1 4} {2 3} {2 4} {3 4}

{1 2 3} {1 2 4} {2 3 4} {1 3 4}

{1 2 3 4}

CS3223: Query Optimization 64

Dynamic Programming (Left-Deep Trees)
(System R)
• The algorithm proceeds by considering increasingly larger subsets of
the set of all relations
• Builds a plan bottom-up (beginning from 1 table, then to 2, and so on)
• Plans for a set of cardinality i are constructed as extensions of the best plan for a set of
cardinality i-1
• Keep only ONE best plan for each set of cardinality i
• Search space can be pruned based on the principal of optimality
• if two plans differ only in a subplan, then the plan with the better subplan is also the better
plan
• Computation overhead reduced due to overlapping subproblems
• Multiple sets of cardinality i uses same set at cardinality (i-1)

CS3223: Query Optimization 65

Dynamic Programming (Left-Deep Trees)
• accessPlan(R) produces the best plan for relation (single
table) R
• joinPlan(p1,R) extends the (partial) join plan p1 into another
plan p2 in which the result of p1 is joined with R in the best
possible way
• p1 = R1 JOIN R2 JOIN R3
• p2 = joinPlan(p1, R) = (R1 JOIN R2 JOIN R3) JOIN R4
• Optimal plans for subsets are stored in optplan() array and
are reused rather than recomputed
CS3223: Query Optimization 66
Dynamic Programming (Cont)
for i = 1 to N
optPlan({Ri}) = accessPlan(Ri)
for i = 2 to N {
forall S subset of {R1, R2, … Rn} such that |S|=i {
bestPlan = dummy plan with infinite cost
forall Rj, Sj, |Sj| = i-1 such that S = {Rj} U Sj {
p = joinPlan(optPlan(Sj), Rj) {}

if cost(p) < cost(bestPlan) {1} {2} {3} {4}

bestPlan = p
} {1 2} {1 3} {1 4} {2 3} {2 4} {3 4}

optPlan(S) = bestPlan
} {1 2 3} {1 2 4} {2 3 4} {1 3 4}

}
Popt = optPlan{R1, R2, … Rn} {1 2 3 4}

CS3223: Query Optimization 67

Dynamic Programming: A Concrete Example

CS3223: Query Optimization 68

Enumeration of Single-relation Plans

CS3223: Query Optimization 69

Enumeration of Two-Relation Plans
s and t not considered because we want to avoid cross products

CS3223: Query Optimization 70

Enumeration of Three-Relation Plans

CS3223: Query Optimization 71

Optimal Plan

CS3223: Query Optimization 72

Dynamic Programming (Cont)
• Time & Space complexity
• For k relations, for left-deep trees, 2k – 1 entries!
DP is not used when k exceeds some value
• For bushy trees, O(3k) (about 15-20)

• Is DP (as presented) optimal?

not optimal if we also consider bushy trees, since DP doesnt include bushy trees

not optimal even within the subset of left-deep trees

CS3223: Query Optimization 73

Dynamic Programming (Cont)
• Time & Space complexity
• For k relations, for left-deep trees, 2k – 1 entries!
• For bushy trees, O(3k)
• Is DP optimal?
• DP may maintain multiple plans per subset of
relations
plans with properties that are beneficial in the future
• Interesting orders might be pruned away because they are not optimal now

plans that are sorted with a certain ordering, for example.

queries that have ordering
and group by. we might want to
keep plans that result in these
relations being sorted. more computataion overhead.

CS3223: Query Optimization 74

CS3223: Query Optimization 75
Dynamic Programming (Cont)
• Time & Space complexity
• For k relations, for left-deep trees, 2k – 1 entries!
• For bushy trees, O(3k)
• Is DP optimal?
• DP may maintain multiple plans per subset of
relations
• Interesting orders
• Is DP with interesting orders optimal?
gives the optimal query with respect to total cost, but not for other
cost models. but will still be a good solution.
CS3223: Query Optimization 76
Summary
• Query optimization is NP-hard
• Instead of finding the best, the objective is largely to
avoid the bad plans
• Many different optimization strategies have been
proposed
• greedy heuristics are fast but may generate plans that are
far from optimal
• dynamic programming is effective at the expense of high
optimization overhead

CS3223: Query Optimization 77

Query Optimization in CS186
No ratings yet
Query Optimization in CS186
51 pages
Lec 14
No ratings yet
Lec 14
26 pages
Relational Operators
No ratings yet
Relational Operators
114 pages
Relational Query Optimization Guide
No ratings yet
Relational Query Optimization Guide
71 pages
Query Optimization
No ratings yet
Query Optimization
51 pages
CAS CS 460/660 Introduction To Database Systems Query Optimization
No ratings yet
CAS CS 460/660 Introduction To Database Systems Query Optimization
20 pages
Slides 12
No ratings yet
Slides 12
24 pages
Lec 17
No ratings yet
Lec 17
24 pages
Lecture11 Query Processing
No ratings yet
Lecture11 Query Processing
37 pages
12 Query Plan Space
No ratings yet
12 Query Plan Space
72 pages
QueryProcess Optim
No ratings yet
QueryProcess Optim
60 pages
Query Evaluation
No ratings yet
Query Evaluation
51 pages
Query Processing Query Optimization
No ratings yet
Query Processing Query Optimization
4 pages
Relational Algebra Optimization
No ratings yet
Relational Algebra Optimization
24 pages
Overview of Query Evaluation: R&G Chapter 12
No ratings yet
Overview of Query Evaluation: R&G Chapter 12
30 pages
UEU Basis Data Pertemuan 14
No ratings yet
UEU Basis Data Pertemuan 14
32 pages
Query Optimization Techniques by Warih Maharani
No ratings yet
Query Optimization Techniques by Warih Maharani
39 pages
Session - 10 Querying
No ratings yet
Session - 10 Querying
36 pages
Relational Query Optimization Guide
No ratings yet
Relational Query Optimization Guide
7 pages
Unit IV Part II
No ratings yet
Unit IV Part II
37 pages
Unit 3
No ratings yet
Unit 3
18 pages
Query Processing and Optimization
No ratings yet
Query Processing and Optimization
45 pages
DBMS Unit5 Lecture1
No ratings yet
DBMS Unit5 Lecture1
22 pages
CH 02
No ratings yet
CH 02
127 pages
CSE 444: Database Internals: Section 4: Query Optimizer
No ratings yet
CSE 444: Database Internals: Section 4: Query Optimizer
16 pages
Execution
No ratings yet
Execution
37 pages
Database Query Processing & Security
No ratings yet
Database Query Processing & Security
39 pages
Zyqwadawfafslecture09 Query Optimization
No ratings yet
Zyqwadawfafslecture09 Query Optimization
90 pages
Query Optimization
No ratings yet
Query Optimization
20 pages
Lecture Notes
No ratings yet
Lecture Notes
96 pages
Introduction To Database Systems CSE 344: Basic Query Evaluation and Indexes
No ratings yet
Introduction To Database Systems CSE 344: Basic Query Evaluation and Indexes
38 pages
06 Query Processing (2) - NDN
No ratings yet
06 Query Processing (2) - NDN
31 pages
13 QP1
No ratings yet
13 QP1
33 pages
Unit 3
No ratings yet
Unit 3
63 pages
Lec6 QP Indexing
No ratings yet
Lec6 QP Indexing
40 pages
Query Optimization
No ratings yet
Query Optimization
103 pages
Query Optimization (Research Paper)
No ratings yet
Query Optimization (Research Paper)
10 pages
Presentation9 - Query Processing and Query Optimization in DBMS
No ratings yet
Presentation9 - Query Processing and Query Optimization in DBMS
36 pages
DBMS Query Optimization Guide
No ratings yet
DBMS Query Optimization Guide
24 pages
Query Proc Notes
No ratings yet
Query Proc Notes
10 pages
SOEN 363 - Data Systems For Software Engineers: Query Optimization
No ratings yet
SOEN 363 - Data Systems For Software Engineers: Query Optimization
25 pages
Adb ch2
No ratings yet
Adb ch2
72 pages
Advanced Database Systems Lecture Notes
No ratings yet
Advanced Database Systems Lecture Notes
79 pages
Introduction To Database Management Systems CS470
No ratings yet
Introduction To Database Management Systems CS470
11 pages
Query Processing and Optimisation - Intr
No ratings yet
Query Processing and Optimisation - Intr
41 pages
Ch12-Query Processing
No ratings yet
Ch12-Query Processing
34 pages
Query Optimization Practice Problems
No ratings yet
Query Optimization Practice Problems
7 pages
Query Optimization
No ratings yet
Query Optimization
29 pages
DBMS IMPORTANT UNIT-4 QUESTIONS and Answer
No ratings yet
DBMS IMPORTANT UNIT-4 QUESTIONS and Answer
5 pages
Query Processing: Cost and Optimization
No ratings yet
Query Processing: Cost and Optimization
25 pages
Chapter 2 - Query Processing and Optimization
100% (3)
Chapter 2 - Query Processing and Optimization
28 pages
Database Query Optimization Guide
No ratings yet
Database Query Optimization Guide
127 pages
Query Optimization Techniques Overview
No ratings yet
Query Optimization Techniques Overview
45 pages
ADBMS Chapter 1
No ratings yet
ADBMS Chapter 1
47 pages
CSE 444 Practice Problems
No ratings yet
CSE 444 Practice Problems
8 pages
Energy-Efficient Wireless Design
No ratings yet
Energy-Efficient Wireless Design
1 page
Gams Tutorial
No ratings yet
Gams Tutorial
90 pages
Technical Program - ICOA2025
No ratings yet
Technical Program - ICOA2025
8 pages
Indefinite Integration in Engineering Math
No ratings yet
Indefinite Integration in Engineering Math
7 pages
Earthworks and Pavement Guide
No ratings yet
Earthworks and Pavement Guide
5 pages
PC142
No ratings yet
PC142
2 pages
Final Detailed Fermenter Design
100% (1)
Final Detailed Fermenter Design
24 pages
ZTE KPI's
No ratings yet
ZTE KPI's
3 pages
Class X - PPT - Part 7 of 8 - Chapter2 - Acids, Bases and Salts
No ratings yet
Class X - PPT - Part 7 of 8 - Chapter2 - Acids, Bases and Salts
21 pages
Python Notes
No ratings yet
Python Notes
19 pages
AC - H1 Math 2014
No ratings yet
AC - H1 Math 2014
8 pages
Biotransformation of Cedrol by Curvularia Lunata ATCC 12017: Dwight O. Collins, Paul B. Reese
No ratings yet
Biotransformation of Cedrol by Curvularia Lunata ATCC 12017: Dwight O. Collins, Paul B. Reese
5 pages
Basic Electronics: Dr. Prasanta Kumar Guha
No ratings yet
Basic Electronics: Dr. Prasanta Kumar Guha
39 pages
Improve Energy Efficiency in Induction Melting
No ratings yet
Improve Energy Efficiency in Induction Melting
13 pages
Lecture 3 - Virtual Work For Indeterminate Truss
No ratings yet
Lecture 3 - Virtual Work For Indeterminate Truss
8 pages
A MODIFIED PENG ROBINSON EQUATION OF STATE FOR ELV 20519 FTP PDF
No ratings yet
A MODIFIED PENG ROBINSON EQUATION OF STATE FOR ELV 20519 FTP PDF
16 pages
Technical Manual Motor 4G52
No ratings yet
Technical Manual Motor 4G52
10 pages
Genetic Algorithm Coin Toss Lab Report
No ratings yet
Genetic Algorithm Coin Toss Lab Report
9 pages
CLR Using C#
No ratings yet
CLR Using C#
14 pages
Aerospace Material Specification
No ratings yet
Aerospace Material Specification
6 pages
Oil Quality Testing Standards Summary
No ratings yet
Oil Quality Testing Standards Summary
1 page
Refractive Index of Ethyl Alcohol Experiment
No ratings yet
Refractive Index of Ethyl Alcohol Experiment
11 pages
Notes On Reinforced Concrete To BS 5400 Part 4
100% (2)
Notes On Reinforced Concrete To BS 5400 Part 4
35 pages
Renal MR and CT Angiography Guide
100% (1)
Renal MR and CT Angiography Guide
14 pages
Power System Protection Guide
No ratings yet
Power System Protection Guide
42 pages
VTE Mathematics
No ratings yet
VTE Mathematics
6 pages
Nozzle and Diffuser D
No ratings yet
Nozzle and Diffuser D
35 pages
Design and Installation of Suctıon Anchor Piles
No ratings yet
Design and Installation of Suctıon Anchor Piles
13 pages
FDMEE Data Loading in Oracle Cloud
No ratings yet
FDMEE Data Loading in Oracle Cloud
28 pages
Descargar Siete Habitaciones A Oscuras PDF
No ratings yet
Descargar Siete Habitaciones A Oscuras PDF
3 pages