Ami Levin, SolidQ
Presented to the Silicon Valley SQL Server User Group, April 2013
Nesting Merged Hash Loops
Ami Levin
CTO, DBSophic
SQL Server Physical Join Operators
Session Goals
SQL Server uses three physical join operators:
Nested loops, Merge, and Hash Match.
In this session we will:
• See how each of these operators works
• Review their advantages and drawbacks
• Understand some of the logic behind the optimizer’s decisions on which operator to use
• Learn to identify common join-related pitfalls
Not This Time
• Outer joins
• Non equi-joins
• Logical processing order
• NULL issues
• Join parallelism
• Partitioned joins
• …
Equi-Inner-Join
SELECT Foo, Bar, ...
FROM T1 INNER JOIN T2
ON T1.C1 = T2.C1
AND T1.C2 = T2.C2
AND ...
WHERE ...
Visual Join Simulator
Nested Loops
[Flowchart: Start → fetch next row from blue input → row exists? If False, quit. If True, find matching rows in red input, then fetch the next blue row.]
Nested Loops I
• Outer loop determines number of iterations
• At least one input should be (relatively) small
• Inner operation is performed for every iteration of the outer loop
  • Index or table scan (naïve)
  • Index seek + lookup
  • Covering index seek
  • Index spool
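The naive variant from the list above (a full scan of the inner input for every outer row) can be sketched in a few lines of Python. This is an illustration only, not SQL Server's implementation; the function name, the tuple row layout, and the sample data are all made up for the example.

```python
def nested_loops_join(outer, inner, key_outer, key_inner):
    """Yield (outer_row, inner_row) pairs whose join keys are equal.

    `inner` must be re-iterable (e.g. a list), because it is scanned
    once per outer row.
    """
    for o in outer:              # one iteration per outer ("blue") row
        k = key_outer(o)
        for i in inner:          # naive inner operation: scan the whole "red" input
            if key_inner(i) == k:
                yield (o, i)


# Hypothetical sample data: (customer_id, name) and (order_id, customer_id).
customers = [(1, "Ann"), (2, "Bob")]
orders = [(10, 2), (11, 1), (12, 2)]

result = list(
    nested_loops_join(customers, orders,
                      key_outer=lambda c: c[0],
                      key_inner=lambda o: o[1])
)
print(result)
```

Because the generator yields as soon as a match is found, the first matching rows come out immediately, and all matches for one outer row appear together, in outer-input order.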
Nested Loops II
• Data pages may be accessed repeatedly
• Risky non-sequential page access pattern
• Output of matching row sets is fast
  • Unordered, but typically grouped
• Physical resources
  • CPU: very low
  • Physical IO: low to very high
  • Memory: low
Nested Loops with Foreign Key Joins
• Foreign keys join parent and child
• Most common relationship is one-to-many
• Often parent input is significantly smaller
• Parent must already be indexed
  • Either primary key or unique constraint
• Therefore, indexing foreign keys often enables efficient use of nested loops
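To make the last point concrete, the sketch below counts how many child rows the inner operation touches with and without an index on the foreign key column. It is a simplified model: a Python dict stands in for the index (a real SQL Server index is a B-tree, not a hash map), and all names and data are hypothetical.

```python
from collections import defaultdict


def build_index(rows, key):
    """Map each key value to the list of rows that carry it."""
    index = defaultdict(list)
    for r in rows:
        index[key(r)].append(r)
    return index


def nested_loops_scan(parents, children):
    """Inner operation is a full scan of the child rows."""
    out, touched = [], 0
    for p in parents:
        for c in children:
            touched += 1                 # every child row is read for every parent
            if c[1] == p[0]:
                out.append((p, c))
    return out, touched


def nested_loops_seek(parents, fk_index):
    """Inner operation is a seek on the foreign key index."""
    out, touched = [], 0
    for p in parents:
        for c in fk_index.get(p[0], []):
            touched += 1                 # only matching child rows are read
            out.append((p, c))
    return out, touched


# Hypothetical data: parents are (id, name), children are (id, parent_id).
parents = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
children = [(10, 2), (11, 1), (12, 2), (13, 1), (14, 2)]
fk_index = build_index(children, key=lambda c: c[1])

scan_rows, scan_touched = nested_loops_scan(parents, children)
seek_rows, seek_touched = nested_loops_seek(parents, fk_index)
print(scan_touched, seek_touched)
```

Both variants return the same rows. The scan reads every child row once per parent, while the seek reads only the matching child rows.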
Merge
[Flowchart: Start; fetch next row from blue input; fetch next row from red input; decision "Row exists?" (True / False, with a Quit exit); decision "Rows match?" (True / False).]
Merge I
• Inputs must be sorted prior to merge
  • Sorted by (all?) join expression(s)
  • Pre-sorted in plan, but not necessarily in DB
• Preferred when sorting supports additional plan operations
• Merge join types
  • One to many
  • Many to many - requires temporary worktable
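A merge join over two pre-sorted inputs can be sketched as follows. This is a simplified illustration, not SQL Server's implementation: the many-to-many case is handled by re-reading a slice of the right input, which plays a role loosely similar to the temporary worktable mentioned above. Function and variable names are hypothetical.

```python
def merge_join(left, right, key_left, key_right):
    """Join two lists that are ALREADY sorted by their join keys."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        kl, kr = key_left(left[i]), key_right(right[j])
        if kl < kr:
            i += 1                       # left key too small: advance left
        elif kl > kr:
            j += 1                       # right key too small: advance right
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and key_right(right[j_end]) == kl:
                j_end += 1
            # Pair every left row with this key against the whole run.
            while i < len(left) and key_left(left[i]) == kl:
                for r in right[j:j_end]:
                    out.append((left[i], r))
                i += 1
            j = j_end
    return out


# Hypothetical sorted inputs; the join key is the first column.
left_rows = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right_rows = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]

merged = merge_join(left_rows, right_rows,
                    key_left=lambda r: r[0], key_right=lambda r: r[0])
print(merged)
```

Each input is read front to back, and the output comes out in join-key order. Only duplicate keys on both sides force part of one input to be re-read.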
Merge II
• Residual predicates
• Fast, ordered and grouped output
• Physical resources*
  • CPU: very low
  • Physical IO: very low
  • Memory: very low
* Excluding sorting costs
Hash Match - Phase I (Build)
[Flowchart: Start → fetch next row from blue input → row exists? If True, apply hash function and fetch the next blue row. If False, continue to Phase II.]
Hash Match - Phase II (Probe)
[Flowchart: continuing from Phase I → fetch next row from red input → row exists? If True, apply hash function and fetch the next red row. If False, quit.]
Hash Match I
• Hash function selection
  • Extremely complex
• CPU intensive
• Build and probe costs are hidden
  • Do not constitute logical reads
• Output of matching row sets is slow
  • Unordered and typically ungrouped
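The two phases can be sketched as an in-memory hash join. This is only an illustration: Python's built-in dict hashing stands in for whatever hash function the engine selects, and the names and data are hypothetical.

```python
from collections import defaultdict


def hash_join(build_input, probe_input, key_build, key_probe):
    """In-memory hash join: build a table on one input, probe it with the other."""
    # Phase I (build): hash every row of the build ("blue") input.
    hash_table = defaultdict(list)
    for b in build_input:
        hash_table[key_build(b)].append(b)

    # Phase II (probe): hash every row of the probe ("red") input and look it up.
    out = []
    for p in probe_input:
        for b in hash_table.get(key_probe(p), []):
            out.append((b, p))
    return out


# Hypothetical data: (customer_id, name) and (order_id, customer_id).
customers = [(1, "Ann"), (2, "Bob")]
orders = [(10, 2), (11, 1), (12, 2), (13, 3)]

joined = hash_join(customers, orders,
                   key_build=lambda c: c[0], key_probe=lambda o: o[1])
print(joined)
```

No row is returned until the whole build input has been hashed, and the output follows probe-input order, so rows that belong to the same build row are not grouped together.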
Hash Match II
• In-memory hash join
• Grace hash join
• Recursive hash join
• Hash bailout
• Hash warnings event class
• Update Statistics
• Add more RAM
• Role reversal
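When the build input does not fit in memory, a grace hash join first splits both inputs into partitions by hashing the join key and then joins each pair of partitions separately. The sketch below keeps the partitions in Python lists, whereas a real engine would spill them to disk, and it applies a simple form of role reversal by building on whichever partition of the pair is smaller. The partition count, function names, and data are assumptions made for the example.

```python
from collections import defaultdict


def partition(rows, key, n_partitions):
    """Split rows into n_partitions lists by hashing the join key."""
    parts = [[] for _ in range(n_partitions)]
    for r in rows:
        parts[hash(key(r)) % n_partitions].append(r)
    return parts


def grace_hash_join(left, right, key_left, key_right, n_partitions=4):
    """Partition both inputs, then hash-join each pair of partitions."""
    out = []
    left_parts = partition(left, key_left, n_partitions)
    right_parts = partition(right, key_right, n_partitions)
    # Rows with equal keys always land in the same partition number.
    for lp, rp in zip(left_parts, right_parts):
        table = defaultdict(list)
        if len(lp) <= len(rp):
            # Build on the left partition, probe with the right one.
            for l in lp:
                table[key_left(l)].append(l)
            for r in rp:
                for l in table.get(key_right(r), []):
                    out.append((l, r))
        else:
            # Role reversal: the right partition is smaller, so build on it.
            for r in rp:
                table[key_right(r)].append(r)
            for l in lp:
                for r in table.get(key_left(l), []):
                    out.append((l, r))
    return out


# Hypothetical data with integer join keys.
left_rows = [(1, "Ann"), (2, "Bob"), (5, "Eve")]
right_rows = [(10, 2), (11, 1), (12, 2), (13, 3), (14, 5), (15, 5)]

pairs = grace_hash_join(left_rows, right_rows,
                        key_left=lambda l: l[0], key_right=lambda r: r[1])
print(sorted(pairs))
```

A recursive hash join applies the same partitioning step again to any partition that is still too large to build in memory.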
Hash Match III
• May indicate sub-optimal indexing
• Best for very large, non-covered joins
• Physical resources
  • CPU: very high
  • Physical IO: low to very high
  • Memory: very high
Summary
                Nested Loops             Merge                       Hash
Good when       Small outer input;       Pre-sorted inputs;          Very large inputs;
                inner input indexed      sorting needed              not well indexed
CPU             Low                      Low (excluding sorting)     High
Memory          Low                      Low (excluding sorting)     High
Physical IO     Low / High               Low                         Low / High
Logical reads   High                     Low                         Low (misleading)
Output          Fast, unordered,         Fast, ordered,              Slow, unordered,
                grouped*                 grouped                     ungrouped*
For More Information
• Books Online
• White papers
• “Inside Microsoft SQL Server” books
• Craig Freedman’s blog
  • http://blogs.msdn.com/craigfr/about.aspx
Thank you
