ASM Internals
Nitin Vengurlekar
Server Technologies
1
ASM Internals
• Architectural Overview
• Metadata Overview
• ASM-DB Instance relationship and communication
• ASM-ASM Instance relationship and communication
• ASM Modules
• ASM Disk Discovery
• ASM Diagnostics
2
ASM Architecture Overview
3
Architectural Overview
[Figure: a Database instance and an ASM instance, each with its own alert log and traces, above an OS/Vendor layer with OS logs and other traces]
4
ASM Metadata
5
ASM Metadata
There are two types of ASM metadata
• On Disk structures
• In Memory structures
6
On Disk Metadata structures
7
Disk Metadata Structures
• All ASM metadata is broken down into blocks
8
Disk Metadata Structures
• Two types of on-disk metadata structures
• Physically addressed metadata: relevant only to a single disk, and therefore addressed directly by physical block number relative to that disk.
9
Disk Metadata Structures
• Currently 17 block types:
– Disk Header
– Partnership and Status Table
– Allocation Table
– Free Space Table
– File Directory
– Disk Directory
– Active Change Directory
– Continuing Operation Directory
– Template Directory
– Alias Directory
– Indirect Block
• Not all blocks are present on every disk
10
Physically addressed metadata
• ASM disk header
• Partnership and Status Table (PST)
• Space Allocation Table
• Free Space Table
11
Disk Header
• Written to block 0 of every ASM disk
• Recognized by discovery
• Contains attributes of the disk group
– Disk group name and creation timestamp
– Allocation Unit size and Metadata block size
– Redundancy type (external, normal, or high)
– Timestamp of last mount
• Describes the disk
– ASM disk name and number
– Failure group name
– Size of disk in allocation units
• Root extent pointer if present
12
On-disk representation of physically addressed ASM metadata
[Figure: units labeled 0, 1, 2, 3]
13
Partnership and Status Table (PST)
• Maintains the membership of disks in the disk group and the status of each disk (online/offline)
• Contains vital information for crash recovery
• Replicated on n disks for protection
• When a disk group is mounted, the PST is read right after discovery
– similar to a file system superblock
14
PST Heartbeat
• ASM heartbeats the PST every 3 seconds
• Heartbeat writes are atomic writes
• Heartbeat prevents two instances in different clusters from mounting the same disk group
– Second instance will hit ORA-15003
• Heartbeat verified at disk group mount
15
PST Trouble Shooting
16
Allocation and Space Tables
17
Virtually addressed metadata
• File Directory
• ASM Disk Directory
• Active Change Directory
• Continuing Operations Directory
• Template Directory
• Alias Directory
• Indirect Block
18
Virtual extent
[Figure: virtual extent layout under normal redundancy and under high redundancy]
• Every ASM file and database file stored in an ASM disk group consists of one or more virtual extents (VEs).
• Each virtual extent points to one or more data extents (DEs): two DEs per VE for normal redundancy, three DEs per VE for high (triple-mirrored) redundancy.
• Each data extent is one allocation unit (1 MB).
• Each data extent is generally made up of 256 data blocks.
• Data blocks contain either database data blocks or ASM metadata.
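The arithmetic behind these figures can be sketched as follows. This is a minimal illustration, not ASM code: the 4 KB block size is an assumption chosen because it yields the 256 blocks per 1 MB extent quoted above, and all function names are hypothetical.

```python
AU_SIZE_BYTES = 1024 * 1024   # 1 MB allocation unit, as stated on the slide
BLOCK_SIZE_BYTES = 4096       # assumed block size; 1 MB / 4 KB gives 256 blocks

# Data extents (mirror copies) behind each virtual extent, by redundancy type.
COPIES = {"external": 1, "normal": 2, "high": 3}


def blocks_per_extent(au_size=AU_SIZE_BYTES, block_size=BLOCK_SIZE_BYTES):
    """Number of blocks that fit in one data extent (one AU)."""
    return au_size // block_size


def data_extents(virtual_extents, redundancy):
    """Total data extents allocated for a file of N virtual extents."""
    return virtual_extents * COPIES[redundancy]


def raw_space_bytes(virtual_extents, redundancy, au_size=AU_SIZE_BYTES):
    """Disk space consumed once mirroring is taken into account."""
    return data_extents(virtual_extents, redundancy) * au_size


print(blocks_per_extent())            # 256
print(data_extents(100, "normal"))    # 200
print(raw_space_bytes(100, "high"))   # 314572800
```

A 100 MB file in a normal redundancy disk group therefore occupies 200 allocation units, which is why file size and space consumed are distinct quantities.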
19
ASM File Extent Terminology
• Virtual extent: an extent as seen by the RDBMS client
– The size (not space) of a given file is assumed to be N virtual extents in the following
• Physical extent number: slot within the extent map when presented as an unstructured array (i.e. no notion of extent sets)
– 0 .. N-1 for an unprotected file
– 0 .. 2N-1 for a 2-way mirrored file
– 0 .. 3N-1 for a 3-way mirrored file
• Extent set number: 0 .. N-1
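The relationship between the two numbering schemes can be sketched as below. The sketch assumes that the mirror copies of one virtual extent occupy consecutive slots in the extent map, which is consistent with the ranges listed above but is not stated on the slide; the function names are hypothetical.

```python
def physical_extent_range(n_virtual, copies):
    """Physical extent numbers for a file of n_virtual virtual extents."""
    return range(n_virtual * copies)


def extent_set_of(physical_extent, copies):
    """Extent set (virtual extent) that a physical extent slot belongs to.

    Assumes the copies of one extent set sit in consecutive slots.
    """
    return physical_extent // copies


def physical_extents_of(extent_set, copies):
    """All physical extent slots that make up one extent set."""
    start = extent_set * copies
    return list(range(start, start + copies))


# A 2-way mirrored file with N = 4 virtual extents uses slots 0 .. 7.
print(list(physical_extent_range(4, 2)))   # [0, 1, 2, 3, 4, 5, 6, 7]
print(extent_set_of(5, 2))                 # 2
print(physical_extents_of(2, 3))           # [6, 7, 8]
```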
20
File directory
• Contains one entry per ASM file; this includes ASM metadata files and database files.
• The File Directory is a self-describing file.
• Each entry in the File Directory points to the file's data extents.
• Each entry contains information about the file. This is reflected in V$ASM_FILE.
• Is triple mirrored in normal and high redundancy disk groups
• Files 1 – 255 are reserved for internal use
• Files 256 and up are for database files
• The File Directory is file #1
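The file numbering rules can be expressed as a small lookup. This is an illustrative sketch only: the three named file numbers are the ones given in this deck (File Directory #1, ACD #3, COD #4), the remaining reserved numbers are left unnamed, and `classify` is a hypothetical helper.

```python
FIRST_DATABASE_FILE = 256

# Only the file numbers named in this deck; other reserved numbers are not guessed.
KNOWN_METADATA_FILES = {
    1: "File Directory",
    3: "Active Change Directory",
    4: "Continuing Operations Directory",
}


def classify(file_number):
    """Classify an ASM file number as metadata (reserved) or database file."""
    if file_number < 1:
        raise ValueError("ASM file numbers start at 1")
    if file_number < FIRST_DATABASE_FILE:
        return KNOWN_METADATA_FILES.get(file_number, "reserved for internal use")
    return "database file"


print(classify(1))     # File Directory
print(classify(200))   # reserved for internal use
print(classify(293))   # database file
```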
21
ASM File Allocation
• Splits files into pieces called extents
• Places extents onto disks at a position called an “Allocation Unit” (AU)
[Figure: extents 0 to 3 of File #293 and of File #4713 placed on allocation units of Disk #4 (AU #15, AU #407), Disk #7 (AU #6), and Disk #13 (AU #11, AU #43)]
22
ASM File Mapping
• Conceptually, mapping can be thought of as a table with one row per relationship of File Extent to Disk Allocation Unit

File #   Ext #   Disk #   AU #
293      2       7        6
4713     0       4        15
4713     1       13       43
293      0       4        407
293      1       13       11
…        …       …        …
23
Extent Map & Allocation Table
• Extent Map – one per file (indexed by Ext #)
• Allocation Table – one per disk (indexed by AU #)

Disk # 7 Alloc Table
AU #   File #   Ext #
0      -1       -1
…      …        …
6      293      2
…      …        …
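The two structures are two indexes over the same conceptual mapping rows. The sketch below builds both from the example rows of the File Mapping slide; plain dictionaries stand in for the on-disk structures, and the function names are hypothetical.

```python
from collections import defaultdict

# (file #, extent #, disk #, AU #) rows from the File Mapping example.
ROWS = [
    (293, 2, 7, 6),
    (4713, 0, 4, 15),
    (4713, 1, 13, 43),
    (293, 0, 4, 407),
    (293, 1, 13, 11),
]


def build_extent_maps(rows):
    """One extent map per file, indexed by extent number."""
    maps = defaultdict(dict)
    for file_no, ext_no, disk_no, au_no in rows:
        maps[file_no][ext_no] = (disk_no, au_no)
    return dict(maps)


def build_allocation_tables(rows):
    """One allocation table per disk, indexed by AU number."""
    tables = defaultdict(dict)
    for file_no, ext_no, disk_no, au_no in rows:
        tables[disk_no][au_no] = (file_no, ext_no)
    return dict(tables)


extent_maps = build_extent_maps(ROWS)
alloc_tables = build_allocation_tables(ROWS)

print(extent_maps[293][2])                   # (7, 6)
print(alloc_tables[7][6])                    # (293, 2)
# Unallocated AUs are shown as -1 / -1 on the slide.
print(alloc_tables[7].get(0, (-1, -1)))      # (-1, -1)
```

The extent map answers "where is extent X of file F?", while the allocation table answers the reverse question "who owns AU A on disk D?".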
24
X$KFFXP Extent Maps
25
X$KFDAT – Allocation Tables
One row per AU over all disks

Column           Description
GROUP_KFDAT      Disk group # (1 - 63)
NUMBER_KFDAT     Disk # for the AU
COMPOUND_KFDAT   (group_kfdat << 24) + number_kfdat
AUNUM_KFDAT      AU # within disk
V_KFDAT          Valid: Y if allocated, N if unused
FNUM_KFDAT       Meaningless unless V_KFDAT set
                 – File number using this AU
                 – 0 for physical/PST AUs
                 – -1 if past end of disk
I_KFDAT          Indirect: meaningless unless V_KFDAT set
                 – Y if indirect extent using the AU
                 – N if direct extent using this AU
XNUM_KFDAT       Physical extent number using AU
RAW_KFDAT        Raw 8-byte contents of extent pointer
                 – useful when V_KFDAT not set to check for corruptions
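The COMPOUND_KFDAT formula packs the disk group number into the high bits and the disk number into the low 24 bits. A minimal sketch of the encoding and its inverse follows; the helper names are hypothetical and only the formula itself comes from the slide.

```python
DISK_BITS = 24
DISK_MASK = (1 << DISK_BITS) - 1


def compound(group_kfdat, number_kfdat):
    """(group_kfdat << 24) + number_kfdat, as given for COMPOUND_KFDAT."""
    if not 0 <= number_kfdat <= DISK_MASK:
        raise ValueError("disk number does not fit in 24 bits")
    return (group_kfdat << DISK_BITS) + number_kfdat


def split(compound_kfdat):
    """Recover (disk group #, disk #) from a compound index."""
    return compound_kfdat >> DISK_BITS, compound_kfdat & DISK_MASK


print(compound(1, 0))    # 16777216
print(compound(2, 7))    # 33554439
print(split(33554439))   # (2, 7)
```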
26
ALTER DISKGROUP CHECK
27
ASM Disk directory
28
Active Change Directory
• The redo log of ASM: records atomic changes to the metadata files
• Examples of ACD records:
– Add or drop templates, files, or aliases
– Allocate/de-allocate extents
• Each instance gets its own 42 MB segment of the ACD, which is rewritten circularly
• The most important portion of the segment is the checkpoint record (ACDC). This is essentially the heartbeat of the ASM instance.
• ACD is file #3
29
Continuing Operations Directory
• Used for ASM instance recovery
• Logs long or large changes to the ASM structure that cannot be done atomically.
• Changes can be initiated by a database instance or an ASM instance
• Two types of operations: background and rollback
30
Continuing Operations Directory
• Background operations are invoked by the ASM instance to perform disk group maintenance, such as rebalance
• Rollback operations are instigated by the database instance; examples include adding or deleting files, templates, or aliases.
• In rollback operations, RBAL in the database instance actually performs the operation.
• COD is file #4
31
Template and Alias Directory
32
On Disk Metadata structures
33
ASM in-memory structures
34
ASM in-memory structures
35
ASM Communication
36
ASM-DB communication
37
Database Connect to ASM
38
[Figure, built up over four slides: a database file open on a server hosting a Database instance and an ASM instance above the Operating System. Steps shown: A. Open; B. Read Map; C. Extent Map; D. File I/O]
42
File Creation
• Database process connects directly to ASM instance
• Database requests file creation and blocks on reply
• ASM foreground creates COD entry and allocates space for new file across all disks
• ASMB receives extent map for new file
• Database request returns with file open
• Database process initializes file contents
• Database process requests commit of file create
• ASM foreground clears COD and marks file created
• Database process logs out of ASM
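The steps above can be sketched as a toy model to show why the COD entry brackets the operation. Everything here is hypothetical: the class, the round-robin placement, and the `recover` method, which assumes that a create left uncommitted in the COD is rolled back at recovery. Freeing the allocated AUs is omitted for brevity.

```python
class AsmSketch:
    """Toy model of the file creation handshake; not the real ASM logic."""

    def __init__(self, disks):
        self.disks = list(disks)
        self.next_au = {d: 0 for d in self.disks}
        self.next_file = 256   # database files start at 256
        self.cod = {}          # file # -> pending operation
        self.files = {}        # file # -> {"extent_map": [...], "created": bool}

    def create(self, n_extents):
        """ASM foreground: log a COD entry, allocate space, return the extent map."""
        file_no = self.next_file
        self.next_file += 1
        self.cod[file_no] = "create in progress"
        extent_map = []
        for ext in range(n_extents):
            disk = self.disks[ext % len(self.disks)]   # simplistic round-robin
            extent_map.append((disk, self.next_au[disk]))
            self.next_au[disk] += 1
        self.files[file_no] = {"extent_map": extent_map, "created": False}
        return file_no, extent_map

    def commit(self, file_no):
        """ASM foreground: clear the COD entry and mark the file created."""
        del self.cod[file_no]
        self.files[file_no]["created"] = True

    def recover(self):
        """Assumed recovery behavior: undo creates still listed in the COD."""
        for file_no in list(self.cod):
            del self.files[file_no]
            del self.cod[file_no]


asm = AsmSketch(disks=[4, 7, 13])
file_no, extent_map = asm.create(n_extents=4)
# ... the database process would initialize the file contents here ...
asm.commit(file_no)
print(file_no, extent_map, asm.cod)
```

If the database process dies between `create` and `commit`, the leftover COD entry is what tells recovery that the half-created file must be removed.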
43
[Figure, built up over five slides: a database file creation on a server hosting a Database instance and an ASM instance above the Operating System. Steps shown: A. Create; B. Allocation; C. Extent Map; D. Initialize; E. Commit]
48
How do DB instances talk to ASM instances?
• DB uses KSV slave pool to communicate with ASM
• ASMB always listens for messages from ASM instance
• Foregrounds (FG) in ASM
[Figure: Database Instance (Db processes, ASMB) connected to ASM Instance (ASM processes, SFG)]
49
[Figure: process-level view of an RDBMS instance and an ASM instance. Labels include: Connected DB Instance; Master Processes and Background Processes (PMON, RBAL, CKPT, ...); Umbilicus between ASMB and UFG; DBFG; Client Bg Slaves / Server Fg Procs (SQLPLUS, O000, O001, ... each paired with an NFG); SGA pipe; Pseudo-Client Bg Slaves / Server Fg Procs (O000, NFG) for DB startup using an spfile on an ASM disk group; Headmaster Processes and Master Processes (SQLPLUS, DBFG, ASMFG)]
Legend:
ASMFG – ASM Foreground
DBFG – Database Foreground
NFG – Network Foreground
UFG – Umbilicus Foreground
Symbols distinguish client utilities, foreground processes, and background processes, and network, message, SGA, and SQLPLUS/foreground communication.
50
ASM-ASM communication
51
ASM-ASM communication
52
ASM recovery
53
ASM Modules
54
Module layout
[Figure: module layout, two stacks running from request down to disks. Labels: RDBMS I/O request; SQL, PL/SQL; KF*; KFN; SQLNET; KFK; SKGFR; ASMLIB; ODMLIB; O/S; disks]
55
ASM Modules
• KFK – Disk I/O
56
ASM Modules
• KFD – Disks (physical storage)
57
ASM Modules
• KFG – Disk group management: add/drop and messaging.
58
ASM Modules
• KFN – SQL Net
– Provides connection between ASM and DB instances
– Runs on the client side and the slave side
59
ASM Modules
• KFX – Interfaces between the parser, the driver and the ASM layer.
60
ASM Modules
• KFB – Block validation
61
ASM Modules
• KFOD – ASM disk discovery utility.
$ kfod disks=all
Disk Size Path
=================================
1: 8526 Mb /dev/raw/raw1
2: 8526 Mb /dev/raw/raw3
3: 8681 Mb /dev/raw/raw4
62
ASM Modules – outside of ASM
• General support services: enqueues, latches, state objects, etc.
• Multi-instance locking and cache fusion locks: used by KFC to coordinate metadata changes.
• Opiexe: parser entry points; calls ASM modules to execute ASM SQL commands.
• KSFD: disk I/O interface. Calls KFIO to access ASM files, and is called by KFK to access dumb disks.
63
Disk discovery in ASM
64
Discovery Basics
65
ASM Diskstring
– /dev/rdsk*
– /dev/raw/raw[2-9]
66
ASM Diskstring
• Set of disks
– For ASM instance
– For any operation
[Figure: all attached disks, divided into ASM disks (including ASM diskgroup 1) and application/system disks]
67
ASM_DISKSTRING Parameter
• OS specific default
– /dev/raw/raw* on Linux
– /dev/rdsk/* on Solaris
– /dev/rdsk/* on HP
– /dev/rhdisk/* on AIX
– \\.\ORCLDISK* on Windows
68
ASM Diskstring
asm_diskstring=/dev/raw/raw[2-9];
create diskgroup d disk '/dev/raw/raw[8-9]';
/dev/raw/raw[89]
/dev/raw/raw[67]
/dev/raw/raw[45]
/dev/raw/raw[23]
/dev/raw/raw[01]
69
ASM Diskstring
asm_diskstring=/dev/raw/raw[2-9];
alter diskgroup d add disk '/dev/raw/raw[4-7]';
/dev/raw/raw[89]
/dev/raw/raw[67]
/dev/raw/raw[45]
/dev/raw/raw[23]
/dev/raw/raw[01]
70
ASM Diskstring
asm_diskstring=/dev/raw/raw[2-9];
alter diskgroup d add disk '/dev/raw/raw0';
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15031: disk specification '/dev/raw/raw0' matches no disks
ORA-15014: location '/dev/raw/raw0' is not in the discovery set
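The three diskstring examples can be reproduced with a small sketch. It assumes that Python's `fnmatch` glob matching is a close enough stand-in for ASM's pattern matching on these simple patterns, and that the diskstring may hold several comma-separated patterns; `discovery_set` and `resolve` are hypothetical helpers, not ASM functions.

```python
from fnmatch import fnmatchcase

# Ten attached raw devices, raw0 .. raw9, as in the slides.
ATTACHED = [f"/dev/raw/raw{i}" for i in range(10)]


def discovery_set(diskstring, attached):
    """Disks visible to the ASM instance: those matching the diskstring."""
    patterns = [p.strip() for p in diskstring.split(",") if p.strip()]
    return [d for d in attached if any(fnmatchcase(d, p) for p in patterns)]


def resolve(disk_spec, diskstring, attached):
    """Disks named by a CREATE/ALTER DISKGROUP disk specification.

    Only disks already in the discovery set can match.
    """
    matches = [d for d in discovery_set(diskstring, attached)
               if fnmatchcase(d, disk_spec)]
    if not matches:
        raise LookupError(
            f"disk specification '{disk_spec}' matches no disks "
            "in the discovery set")
    return matches


print(resolve("/dev/raw/raw[8-9]", "/dev/raw/raw[2-9]", ATTACHED))
print(resolve("/dev/raw/raw[4-7]", "/dev/raw/raw[2-9]", ATTACHED))
try:
    resolve("/dev/raw/raw0", "/dev/raw/raw[2-9]", ATTACHED)
except LookupError as err:
    print(err)
```

The last call fails for the same reason as the ORA-15014 case: `/dev/raw/raw0` exists on the system but lies outside the discovery set defined by the diskstring.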
71
Common Discovery Problems
72
Overlapping Partitions
Partition 2
73
OS Disk Label Missing
• Solaris requires a disk label for valid device access
– Visible on slice 2, e.g. c0t0d0s2 (typically)
– ASM should use a partition covering the entire disk except the header (typically slice 6)
75
ASMLib Overview
76
What is Oracle’s ASMLIB
Provides:
• Device discovery
• Batch I/O scheduling
Components:
• Library
• Device Driver
• Utilities
[Figure: server stack. User mode: Oracle DB/ASM on top of the ASM API (implementation). Kernel mode: O/S and device drivers. Storage below.]
77
Further ASMLIB help
• http://otn.oracle.com/tech/linux/asmlib
• http://asm.us.oracle.com
– Select ASM tab (on top right from RAC page)
78
ASM Diagnostics
79
What are Typical Problems?
80
What are Typical Problems?
81
Where to Look for Clues?
82
Where to Look for Clues?(2)
83
Monitoring ASM Memory Usage
• v$sga: summary of SGA component groups
• v$sgainfo: breakdown of memory allocated to individual SGA components
• v$sgastat: more detailed breakdown of memory usage within each SGA component
• x$ksmlru: list of top shared pool allocations which caused LRU activity since the last query of this view
• x$kghlu: summary data about shared pool LRU
84
Monitoring ASM Memory Usage
(Contd)
• Internal SGA overhead
– Shared memory used by internal Oracle components for structures, metadata, etc.
– Allocated during instance startup
– Can be viewed in v$sgainfo
– Comes out of shared_pool_size, so subtract this overhead to obtain the effective pool size
85
v$sgastat
86
ASM Memory Usage (Contd)
87
Common Memory Errors
• Out-of-memory errors
– ORA-04031: out of SGA memory
– ORA-04030: out of PGA memory
• Error message contains
– size of failing allocation
– name of pool used for this allocation
– identifying string for the failing allocation
• Some out-of-memory errors in the DB alert log may need a fix on the ASM side
– check the error stack
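Pulling those three pieces out of an alert log line can be sketched as below. The message layout in `SAMPLE` is the commonly seen ORA-04031 form and is used here as an illustrative assumption, not as output from any particular system; the parser and its field names are hypothetical.

```python
import re

# Assumed layout: size, then a parenthesized list of quoted strings whose
# first element is the pool name and whose remainder identifies the allocation.
ORA_04031 = re.compile(
    r"ORA-04031: unable to allocate (\d+) bytes of shared memory \((.*)\)")


def parse_ora_04031(line):
    """Return the failing size, pool name, and identifying strings, or None."""
    match = ORA_04031.search(line)
    if match is None:
        return None
    args = re.findall(r'"([^"]*)"', match.group(2))
    return {
        "bytes": int(match.group(1)),
        "pool": args[0] if args else None,
        "details": args[1:],
    }


SAMPLE = ('ORA-04031: unable to allocate 4096 bytes of shared memory '
          '("shared pool","unknown object","sga heap(1,0)","kglsim heap")')

print(parse_ora_04031(SAMPLE))
```

Tracking the pool name and identifying string across repeated errors helps decide whether the shortage is on the database side or the ASM side.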
88
Viewing Heaps
• Why?
– Querying contents of a heap to observe usage
– Tracking size of heap over time
• SGA heap dumps can be disruptive for the instance, because of latching
– Not advisable on large production instances
– Typical usage would be on a test instance
• How to view contents of heaps
– Using SQL
– Using alter session
– Using oradebug
89
Viewing Heaps - Using SQL
90
Viewing Heaps (Cont)
• Using alter session
– SQL> alter session set events 'immediate trace name heapdump level <n>';
where <n> corresponds to one of the following:
1: pga heap, 1025: pga heap w/ contents
2: sga heap, 2050: sga heap w/ contents
4: uga heap, 4100: uga heap w/ contents
8: current call heap, 8200: current call heap w/ contents
16: user call heap, 16400: user call heap w/ contents
32: large alloc heap, 32800: large alloc heap w/ contents
• Using oradebug
– SQL> oradebug setmypid
– SQL> oradebug setospid <target_pid>
– SQL> oradebug dump heapdump <n>
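The "with contents" levels listed for the PGA, SGA, call, and large alloc heaps are each the base level multiplied by 1025 (the base level plus the base level shifted left by 10 bits). The sketch below assumes that rule holds for every heap, under which the UGA value is 4100, and builds the corresponding event string; the helper names are hypothetical.

```python
HEAPS = {
    1: "pga heap",
    2: "sga heap",
    4: "uga heap",
    8: "current call heap",
    16: "user call heap",
    32: "large alloc heap",
}


def with_contents(level):
    """Heapdump level that also dumps heap contents: level * 1025."""
    if level not in HEAPS:
        raise ValueError(f"unknown base heapdump level: {level}")
    return level + (level << 10)


def heapdump_event(level):
    """The ALTER SESSION statement for a given heapdump level."""
    return ("alter session set events "
            f"'immediate trace name heapdump level {level}'")


for base, name in HEAPS.items():
    print(f"{base:>2}: {name}, {with_contents(base)}: {name} w/ contents")

print(heapdump_event(with_contents(2)))
```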
91
ASM Debug tools
92
Cluster Synchronization
Services (CSS)
• On a single node
– CSS runs from an ORACLE_HOME (i.e. ORA_CRS_HOME is an ORACLE_HOME)
– Provides synchronization between ASM and DB instances on a node
• On a cluster
– CSS runs from the ORA_CRS_HOME
– Provides synchronization
   • between ASM and DB instances on a node
   • between ASM instances in a cluster
93
CSS Debugging
• CSS daemon logs in $ORA_CRS_HOME/css/log
– ocssdN.log
– ocssdN.blg
• CSS daemon init output files in $ORA_CRS_HOME/init
• CSS daemon startup files (if any), e.g. /etc/init.d/init.cssd
• Stack trace of any relevant core file found in a subdirectory of $ORA_CRS_HOME/init
– Provide the stacks of all threads
94
CSS Debugging (cont)
• Oracle Cluster Registry (OCR)
– output of 'ocrdump'
– Name output files based on node names
– Provide output from all nodes to rule out OCR configuration errors
• Network configuration
– Varies by platform
– Solaris: 'ifconfig -a'
– Linux: 'ifconfig' command
95
Further ASM Resources
• http://asm.us.oracle.com
Select ASM tab (on top right from RAC page)
– ASM FAQ
– Links to training webcasts
– White papers
• Report ASM bugs against product 5, component RDBMS, sub-component ASM
• Helpasm_ww email alias
96