Part II : Waits Events
and the
Geeks who love them
Kyle Hailey
http://perfvision.com
Wait Events
Wait Events
Copyright 2006 Kyle Hailey
And the Geeks Who Love Them
Copyright 2006 Kyle Hailey
In this Section:
Introduction to Waits
Tuning Methodology
Plan of Action
Statspacks, AWR or OEM for Collection
Data
Based on Waits
Using Waits to Solve Bottlenecks
Copyright 2006 Kyle Hailey
Database is Hung!
Everybody blames the database
Yet 9 out of 10 dba’s agree it’s not the
database
How do you prove it to management?
On the off chance it’s the database, what do
we do?
Copyright 2006 Kyle Hailey
Database:
Guilty until proven innocent
*$%@!!
Copyright 2006 Kyle Hailey
Oracle Instrumentation
CPU Locks
Redo Lib Buffer
Cache Cache
Network
IO
Copyright 2006 Kyle Hailey
Wait Areas
Buffer Cache
I/O
Locks
Waits Library Cache
Redo
SQL*Net
We’ll discuss Waits in these logical database areas
Copyright 2006 Kyle Hailey
Wait Tree Write IO
Read IO
Rollback
Buffer Busy Free lists
IO
Cache Latches IO Read
Buffer Cache
Library Cache
Library Cache
Shared Pool
Wait
s Lock
TX Row Lock
Redo TX ITL Lock
SQL Net HW Lock
Log File
Log Buffer
Copyright 2006 Kyle Hailey
Log File Sync
Waits
Introduced in v7
Revolutionized tuning
Changed from Ratio Guesswork to empirical
measure of time lost to bottlenecks
10g added the crucial addition ASH
Not only identifies bottlenecks but
Who (session, service, package, procedure)
Where (CPU, Wait)
When (time)
What (SQL statement)
Copyright 2006 Kyle Hailey
Tuning Methodology
1. Machine
Run queue (CPU)
Check other applications
reduce CPU usage or add CPUs
Paging
Reduce memory usage or add memory
2. Oracle
Waits + CPU > Available CPU
Tune waits
CPU 100% We are going to
Tune SQL concentrate here
Else low waits, available CPU then on WAITS
It’s the application
Copyright 2006 Kyle Hailey
Dependable Tuning Strategy
Determine AAS :
1. Run Statspack or AWR Report
Top 5 Timed Events
~50 lines down from top
Need Available CPU
Elapsed Time
CPU_COUNT
2. ASH Report : ashrpt.sql
3. OEM 10g
Performance Page does everything
If there is a wait bottleneck tune the wait
Copyright 2006 Kyle Hailey
Tuning Methodology Graphics
Relax, it’s the Get to Work!
application
Copyright 2006 Kyle Hailey
Waits beyond OEM
OEM identifies Wait problems
Provides solutions with ADDM sometimes
But
What do you do when ADDM isn’t sufficient?
What do you do if you don’t have OEM 10g?
Waits
Need to know about waits
How they work
How to analyze them
Copyright 2006 Kyle Hailey
v$active_session_history
When ADDM fails or we don’t have ADDM we can
collect the necessary information from
v$active_session_history
Session (user, service, client, package, procedure, etc)
SQL statement
For IO waits, buffer busy waits and enqueue TX waits
CURRENT_OBJ# ,CURRENT_FILE# ,CURRENT_BLOCK#
Blocking_Session
P1
P2
P3
Copyright 2006 Kyle Hailey
What are P1,P2,P3 ?
Each Wait has a 3 parameters P1,P2,P3
Give detailed information
Meaning different for each wait
Meaning definitions in V$event_name
col parameter1 for a10
col parameter2 for a10
col parameter3 for a10
select parameter1 ,parameter2 , parameter3
from v$event_name
where name = '&1';
Copyright 2006 Kyle Hailey
Wait Arguments Example
select parameter1 ,parameter2 , parameter3
from v$event_name;
NAME PARAMETER1 PARAMETER2 PARAMETER3
------------------------------ ----------- --------------- ---------------
latch: cache buffers chains address number tries
free buffer waits file# block# set-id#
buffer busy waits file# block# class#
latch: redo copy address number tries
log buffer space
switch logfile command
log file sync buffer#
db file sequential read file# block# blocks
enq: TM - contention name|mode object # table/partition
undo segment extension segment#
enq: TX - row lock contention name|mode usn<<16 | slot sequence
row cache lock cache id mode request
library cache pin handle address pin address 100*mode+namesp
library cache load lock object address lock address 100*mask+namesp
pipe put handle address record length timeout
Copyright 2006 Kyle Hailey
Wait Analysis requires p1,p2,p3
Of the top 30 wait events 8 can be solved
without ASH
free buffer waits
log buffer space
log file switch (archiving needed)
log file switch (checkpoint incomplete)
log file switch completion
log file sync
switch logfile command
write complete waits
The rest need Example “hard” waits
SQL Buffer busy wait
Row cache lock
Latch free
P1,P2,P3 row lock contention
Statspack , AWR fail Latch: cache buffers chains
Copyright 2006 Kyle Hailey
Wait Analysis
SQL
Most often the tuning answer lies in looking at what the application is
doing, and changing it
Parameters
Find extended wait information
Parameter1, Parameter2, Parameter3
Defined in v$event_name
Guess Work
Sometimes the wait events that are found are not in the
documentation and it takes some educated guesswork to figure out
the problem
Difficult Waits
These 4 waits have multiple causes
Latches
p2 = latch # (p1= address, p3= tries)
Locks
p1 = lock type and mode ( p2 = id1, p3= id2)
Buffer Busy
p3 = block class#, p1= file, p2=block
(in 9i p3 was the bbw type)
Row Cache Lock
p1 = cache id (p2 = mode, p3=request)
Copyright 2006 Kyle Hailey
Example ASH Query
Select ash.p1,
ash.p2,
CURRENT_OBJ#||' '||o.object_name objn,
o.object_type otype,
CURRENT_FILE# filen,
CURRENT_BLOCK# blockn,
ash.SQL_ID,
w.class ||' '||to_char(ash.p3) block_type
from v$active_session_history ash,
( select rownum class#, class from v$waitstat ) w,
all_objects o
where event='buffer busy waits'
and w.class#(+)=ash.p3
and o.object_id (+)= ash.CURRENT_OBJ#
and ash.session_state='WAITING'
and ash.sample_time > sysdate - &1/(60*24)
Order
P1 by sample_time
P2 OBJN OTYPE FILEN BLOCKN SQL_ID BLOCK_TYPE
1 112796 66053 BBW_INDEX_VAL_I INDEX 1 112796 6avm49ys4k7t6 data block 1
1 112401 66053 BBW_INDEX_VAL_I INDEX 1 112401 5wqps1quuxqr4 data block 1
1 112796 66053 BBW_INDEX_VAL_I INDEX 1 112796 5wqps1quuxqr4 data block 1
1 113523 66053 BBW_INDEX_VAL_I INDEX 1 113523 5wqps1quuxqr4 data block 1
Copyright 2006 Kyle Hailey
Waits we will Ignore
One thing that makes waits difficult is knowing which
ones to look at and which ones to ignore.
Background
Idle
Resource Manager
Parallel Query
RAC
Good stuff, but not covered in this seminar
Copyright 2006 Kyle Hailey
Background & Foreground
Background Processes
DBWR
LGWR
PMON
SMON
Etc
Foreground Processes
SQL*Plus
Pro*C
SQL*Forms
Oracle applications
Only interested in Foreground waits
Copyright 2006 Kyle Hailey
Background Waits
ASH
Avoid Background waits in ASH with
Select
Select …from
…from v$active_session_history
v$active_session_history
where
where SESSION_TYPE='FOREGROUND'
SESSION_TYPE='FOREGROUND'
V$session_wait joined to v$session
select
select ……
from
from v$session
v$session s,
s,
v$session_wait
v$session_wait w
w
where
where w.sid=s.sid
w.sid=s.sid
and
ands.type='USER'
s.type='USER'
Copyright 2006 Kyle Hailey
Idle Waits
Filtered Out of ASH by default
10g
where wait_class != ‘Idle’
Create a list
Select
Select name
name from
from v$event_name
v$event_name where
where
wait_class=‘Idle’;
wait_class=‘Idle’;
9i
Create a list with
Documentation
List created from 10g
Stats$idle_events from statspack
SQL*Net message from client
Copyright 2006 Kyle Hailey
PQO and Resource Manager
Resource manager throttles user
Createswait
Obfuscates problems
select
select name
name from
from v$event_name
v$event_name where
where
wait_class='Scheduler';
wait_class='Scheduler';
Parallel Query Wait events are unusable
Save waits are both idle and waits
Parallel Query Waits start with ‘PX’ or ‘KX’
PX Deq: Par Recov Reply
PX Deq: Parse Reply
Copyright 2006 Kyle Hailey
RAC Waits
RAC waits are certainly interesting but will be covered
outside of this presentation.
You are on your own
Check documentation
If you are not using RAC then no worries
10g
Select
Select event
event from
from v$event_name
v$event_name where
where
wait_class=‘Cluster’;
wait_class=‘Cluster’;
9i
RAC and OPS waits usually contain the word “global”
Copyright 2006 Kyle Hailey
Latches
Protect areas of memory from concurrent use
Light weight locks
Bitin memory
Atomic processor call
Fast and cheap
Gone if memory is lost
Often used in cache coherency management
Changes to a data block
Exclusive Generally
Sharing reading has been introduced for some latches
Copyright 2006 Kyle Hailey
Finding Latches
“latch free”
Covers many latches, find the problem latch by
1. select name from v$latchname where latch# = p1;
OR
2. Find highest sleeps in Statspack latch section
In 10g, important latches have a wait event
latch: cache buffers chains
latch: shared pool
latch: library cache Latches avoid:
• deallocating while someone is accessing
• reading while someone is modifying
• modifing while someone is modifying
• modifing while someone is reading
Copyright 2006 Kyle Hailey
Latches and Spin Count
Latch contention is a symptom of a problem
Spin Count
Band aid
Doesn’t solve the problem
If there is free CPU
Can increase spin count
Underscore parameter in 10g (not recommended to change)
Fix the problem
Instead of using spin count
Find the problem and fix it
Copyright 2006 Kyle Hailey
Enqueues aka Locks
“Enqueue” wait – covers all locks pre 10
Protect data against concurrent changes
Lock info written into data structures
Block headers
Data blocks
Written in cache structures
Shareable in compatible modes
Copyright 2006 Kyle Hailey
Locks 10g
10g breaks all Enqueues out
enq: HW - contention Configuration
enq: TM - contention Application
enq: TX - allocate ITL entry Configuration
enq: TX - index contention Concurrency
enq: TX - row lock contention Application
enq: UL - contention Application
Copyright 2006 Kyle Hailey
Row Cache Lock
Need p1 to see the cache type
SQL>
SQL> select
select cache#,
cache#, parameter
parameter from
from v$rowcache;
v$rowcache;
CACHE#
CACHE# PARAMETER
PARAMETER
----------
---------- --------------------------------
--------------------------------
11 dc_free_extents
dc_free_extents
44 dc_used_extents
dc_used_extents
22 dc_segments
dc_segments
00 dc_tablespaces
dc_tablespaces
55 dc_tablespace_quotas
dc_tablespace_quotas
66 dc_files
dc_files
77 dc_users
dc_users
33 dc_rollback_segments
dc_rollback_segments
88 dc_objects
dc_objects
17
17 dc_global_oids
dc_global_oids
12
12 dc_constraints
dc_constraints
Copyright 2006 Kyle Hailey
Additional Support
AWR Tables – on disk for 7 days by default
DBA_HIST_ACTIVE_SESS_HISTORY
1 in 10 ASH samples
DBA_HIST_SEG_STAT
Good for ITL and buffer busy wait
DBA_HIST_SYSTEM_EVENT
Important for getting avg wait times
DBA_HIST_SQLSTAT
sql execution deltas
DBA_HIST_SYSMETRIC_SUMMARY
Statistics avg, max, min
Metric Tables – in memory deltas
V$EVENTMETRIC
Copyright 2006 Kyle Hailey
All Events over 7 days
select count(*), event from
( select event from DBA_HIST_ACTIVE_SESS_HISTORY
where sample_time < ( select min(sample_time) from
v$active_session_history)
union all
select event from v$active_session_history
)
group by event
order by event
/
Copyright 2006 Kyle Hailey
Average Wait Times Historic
select
btime,
(time_ms_end-time_ms_beg)/nullif(count_end-count_beg,0) avg_ms
from (
select
to_char(s.BEGIN_INTERVAL_TIME,'DD-MON-YY HH24:MI') btime,
total_waits count_end,
time_waited_micro/1000 time_ms_end,
Lag (e.time_waited_micro/1000)
(
OVER( PARTITION BY e.event_name ORDER BY s.snap_id) time_ms_beg,
Lag (e.total_waits)
(
OVER( PARTITION BY e.event_name ORDER BY s.snap_id) count_beg
from
DBA_HIST_SYSTEM_EVENT e,
DBA_HIST_SNAPSHOT s
where BTIME AVG_MS
-------------------- ------------
s.snap_id=e.snap_id
and e.event_name= '&1' 08-JAN-08 01:00 1.017
order by begin_interval_time 08-JAN-08 02:00 .720
) 08-JAN-08 03:00 .621
order by btime; 08-JAN-08 04:00 1.747
08-JAN-08 05:00 1.046
08-JAN-08 06:00 1.444
Copyright 2006 Kyle Hailey
Avg Wait times now
select
en.name,
(time_waited)/nullif(wait_count,0) avg_ms,
wait_count
from
v$eventmetric e,
v$event_name en
where
e.event# = en.event#
and en.name like '%&1%‘;
NAME AVG_MS WAIT_COUNT
db file sequential read .658863707 6420
db file scattered read .549427419 186
db file parallel write .089073438 64
Copyright 2006 Kyle Hailey
Object Translation
Current fields in v$active_session_history
CURRENT_OBJ#
CURRENT_FILE#
CURRENT_BLOCK#
Called “ROW_WAIT_%” in v$session
Only apply to
Buffer Busy Waits
IO Waits
Enqueue TX
Ignore these fields for other wait events
Wait interface Weaknesses
Logons
EM 10g shows these on perf page
Time model helps
V$SYS_TIME_MODEL
connection management call elapsed time
I’ve had problems
Paging/Memory issues
CPU starvation
Null Events
Bugs – read external table reports CPU
http://blog.tanelpoder.com/
Copyright 2006 Kyle Hailey
Dependable Tuning Strategy
Run Statspack/AWR report
Top 5 Timed Events
~50 lines down from top
Need Available CPU
Elapsed Time
CPU_COUNT
OEM 10g
Performance Page does everything !
OEM doesn’t solve the problem
Query v$active_session_history directly
Copyright 2006 Kyle Hailey
Summary
Waits make Tuning Easy
Check Machine Health
Tune Waits
Tune CPU
Tune SQL
Change Application Architecture
Use
OEM10g
Statspack/AWR,
S/ASH
Ignore Background, Idle, Resmgr, PQO
Use ASH if OEM fails
See http://perfvision.com for more info
Copyright 2006 Kyle Hailey