MySQL EXPLAIN Explained
Quick and Easy Query Optimisation
Before we begin...
What you need to know
How and why we add indexes to tables The benefits of correct field typing Understanding of the ideals of 3NF Basic understanding of SQL JOINs
This presentation
Very quick introduction to EXPLAIN Improve understanding of MySQL and indexing Simplified examples / results
Introduction - Using MySQL EXPLAIN
Prefix a SELECT query with EXPLAIN
MySQL won't actually execute the query, just analyse it EXPLAIN helps us understand how and when MySQL will use indexes EXPLAIN returns a table of data from which you identify potential improvements Optimise queries in three ways
Modify or create indexes Modify query structure Modify data structure
Optimised queries = faster results, lower server load...
Introduction - Review of Indexing
Fast, compact structure for identifying row locations Keep indexes in memory by trimming the fat:
Can I reduce the characters in that VARCHAR index? Can I use a TINYINT instead of a BIGINT? Can I use an INTEGER to describe a status or flag (rather than a textual description)?
Chop down your result set as quickly as possible MySQL will only use one index per query/table it cannot combine two separate indexes to make a useful one *
Understanding and preparation brings about Indexing Strategy
* Not strictly true - look up Index Merge operations
Booking application schema
attendees
attendee_id INTEGER (PK) surname VARCHAR conference_id INTEGER (FK) registration_status TINYINT
conferences
conference_id INTEGER (PK) location_id INTEGER (FK) topic_id INTEGER (FK) date DATE
EXPLAIN Worked Example
EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 0
table attendees possible_keys NULL key NULL rows 14052
The three most important columns returned by EXPLAIN 1)Possible keys
All the possible indexes which MySQL could have used Based on a series of very quick lookups and calculations
2)Chosen key 3)Rows scanned
Indication of effort required to identify your result set
EXPLAIN Worked Example
EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 0
table attendees possible_keys NULL key NULL rows 14052
Interpreting the results
No suitable indexes for this query
MySQL had to do a full table scan
Full table scans are almost always the slowest query Full table scans, while not always bad, are usually an indication that an index is required
EXPLAIN Worked Example
ALTER TABLE ADD INDEX conf (conference_id); ALTER TABLE ADD INDEX reg (registration_status); EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 1;
table attendees possible_keys conf, reg key conf rows 331
MySQL had two indexes to choose from, but discarded reg reg isn't sufficiently unique
The spread of values can also be a factor (e.g when 99% of rows contain the same value)
Index uniqueness is called cardinality There is scope for some performance increase...
Lower server load, quicker response
EXPLAIN Worked Example
ALTER TABLE ADD INDEX reg_conf_index (registration_status, conference_id); EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 1;
table attendees
possible_keys reg, conf, reg_conf_index
key reg_conf_index
rows 204
reg_conf_index is a much better choice
Note that the other two keys are still available, just not as effective
Our query is now served well by the new index
EXPLAIN Worked Example
DELETE INDEX conf; DELETE INDEX reg; EXPLAIN SELECT * FROM attendees WHERE conference_id = 123
table attendees
possible_keys NULL
key NULL
rows 14052
Without the conf index, we're back to square one The order in which fields were defined in a composite index affects whether it is available for use in a query Remember, we defined our index : (registration_status, conference_id)
Potential workaround: EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 AND registration_status >= -1
table attendees possible_keys reg_conf_index key reg_conf_index rows 204
EXPLAIN Example 2
EXPLAIN SELECT * FROM attendees WHERE surname LIKE 'har%';
table attendees possible_keys surname key surname rows 234
MySQL uses an index on surname which is good.
EXPLAIN SELECT * FROM attendees WHERE surname LIKE '%har%';
table attendees possible_keys NULL key NULL rows 14052
MySQL doesn't even try to use an index!
EXPLAIN Example 3
EXPLAIN SELECT * FROM conferences WHERE location_id = 2 OR topic_id IN (4,6,1)
table conferences possible_keys location_id, topic_id key NULL rows 5043
MySQL doesn't use an index, because of the OR
ALTER TABLE ADD INDEX location_topic (location_id, topic_id);
EXPLAIN SELECT * FROM conferences WHERE location_id = 2 OR topic_id IN (4,6,1)
table conferences possible_keys location_id, topic_id, location_topic key location_topic rows 15
Full table scan avoided could also use UNION (ALL) trick
EXPLAIN Example 4
EXPLAIN SELECT * FROM attendees WHERE MD5(conference_id) = MD5(123)
table attendees possible_keys NULL key NULL rows 14052
Understandably, MySQL has to do a full table scan
A more realistic example? EXPLAIN SELECT * FROM conferences WHERE DATE_FORMAT(date,'%a') = 'Sat'
table conferences possible_keys NULL key NULL rows 5043
A good candidate for Optimisation #3 Modify Data Structure
JOINs
JOINing together large data sets (>= 100,000) is really where EXPLAIN becomes useful Each JOIN in a query gets its own row in EXPLAIN
Make sure each JOIN condition is FAST
Make sure each joined table is getting to its result set as quickly as possible The benefits compound if each join requires less effort
JOINs Simple Example
EXPLAIN SELECT * FROM conferences INNER JOIN attendees USING (conference_id) WHERE conferences.location_id = 2 AND conferences.topic_id IN (4,6,1) AND attendees.registration_status > 1
table conferences attendees type ref ALL possible_keys key rows 15 14052
conference_topic conference_topic NULL NULL
Looks like I need an index on attendees.conference_id There are 13 different values for type Another indication of effort, aside from rows scanned Here, ALL is bad we should be aiming for ref Common values are const, ref, and all http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
The extra column
With every EXPLAIN, you get an extra column, which shows additional operations invoked to get your result set.
table attendees possible_keys conf key conf rows 331 extra Using where Using filesort
Some example extra values:
Using Using Using Using
where temporary table filesort index
There are many more extra values which are discussed in the MySQL manual.
Using filesort
Avoid, because: Doesn't use an index Involves a full scan of your result set Employs a generic (i.e. one size fits all) algorithm Uses the filesystem (eeek) Will get slower with more data
It's not all bad...
Perfectly acceptable provided you get to your result set as quickly as possible, and keep it predictably small Sometimes unavoidable - ORDER BY RAND() ORDER BY operations can use indexes to do the sorting!
Using filesort Example
EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 ORDER BY surname
table attendees possible_keys conference_id key conference_id rows 331 Extra Using filesort
MySQL is using an index, but it's sorting the results slowly ALTER TABLE attendees ADD INDEX conf_surname (conference_id, surname); EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 ORDER BY surname
table attendees possible_keys conference_id, conf_surname key conf_surname rows 331 Extra
We've avoided a filesort
Using index
Celebrate, because:
MySQL got your results just by consulting the index, Which could well have been sat in memory MySQL didn't need to even look at the table to get you your results Opening a table can be an expensive operation. MySQL can answer the next query more quickly The fastest way for you to get your data?
Particularly useful...
When you're just interested in a single date or an id Or the COUNT(), SUM(), AVG() etc. of a field
Using index Example
EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123
table attendees possible_keys conference_id key conference_id rows 331 Extra
Nothing is actually wrong with this query it could just be quicker! ALTER TABLE attendees ADD INDEX conf_age (conference_id, age); EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123
table attendees possible_keys conference_id, conf_surname key conf_surname rows 331 Extra Using index
Outside of caching, the fastest way to get your data *
*Not a guarantee
Moving forward...
Just because your queries are fast now, doesn't mean that they will stay that way forever Enable MySQL's Slow Query Log --log-slow-queries=/var/lib/mysql/slow-query.log Defaults to logging queries which take more than 10 seconds --long_query_time=1 Use Percona's microslow patch for values < 1 second Find the query in the log, EXPLAIN it, improve it, rinse and repeat
Moving forward...
Use the command line to identify more general problems mysqladmin -u dbuser -p -r -i 10 extended-status Figures are relative, updated every 10 seconds Slow_queries = number of slow queries in last period Select_Scan = full table scans Select_full_join = full scans to complete join operations Created_tmp_disk_tables = filesorts Key_read_requests/Key_write_requests Determine write/read weighting of our application and alter your indexes accordingly
MySQL Resources
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html High Performance MySQL - Baron Schwartz
ISBN 0596101716 20 (Money well spent)
http://www.mysqlperformanceblog.com Regular posts