Home » Performance » MySQL » How many times are my indexes used?
| How many times are my indexes used? [message #365] |
Sun, 19 November 2006 10:40  |
Hasgaroth Messages: 7 Registered: November 2006 Location: UK |
Junior Member |
|
|
Hi All,
I manage quite a large system of mySQL MYISAM databases, roughly 400 in total, each with about 100 tables - the total size of the db system is over 800Gb, with over 8 billion rows in total (including summary tables).
All the databases are in one of two formats, each with a set of tables storing a particular segment of the data.
All the tables have index built on them, but what I am wondering is how many of the indexes are actually used, and how often.
Our main concern at the moment is disk space, and making the best use of it. I would like to locate and remove any indexes that are not actually used in the querying of the data.
Most of the table structures are using the correct field definitions, tinyint and smallint instead of int where appropriate, etc, and the rest are being worked on to reduce the data file sizes.
I can find out when an index was last accessed using the timestamp on the filesystem (Linux), but what I am after is when it was last used in a query?
Knowing when it was last updated is not really relevant, as our data is updated frequently.
Hope someone out there can help!
Thanks in advance,
Hasgaroth.
|
|
|
| Re: How many times are my indexes used? [message #367 is a reply to message #365 ] |
Mon, 20 November 2006 11:46   |
Peter Messages: 405 Registered: August 2006 |
Senior Member Super Guru |
|
|
Hi,
Yes this is hard nut to crack, especially as indexes are stored in single file for MyISAM tables which means you would not see which of indexes is used if there are any.
The way you can find unused indexes is to have general query log, run EXPLAIN on each select query in it and analyze "index" column in output to see which indexes were used by this set of queries.
This leaves out indexes used by only update/delete queries but is still pretty good in most cases.
You also can count queries and ie see how many queries used one or another index.
Peter Zaitsev, MySQL Performance Expert
MySQL Performance Blog - http://www.mysqlperformanceblog.com
MySQL Consulting http://www.mysqlperformanceblog.com/mysql-consulting/
|
|
| | | |
| Re: How many times are my indexes used? [message #388 is a reply to message #379 ] |
Wed, 22 November 2006 07:04   |
Hasgaroth Messages: 7 Registered: November 2006 Location: UK |
Junior Member |
|
|
Here's the code and query log data structure...
DROP TABLE IF EXISTS `QUERY_INFO`;
CREATE TABLE IF NOT EXISTS `QUERY_INFO` (
`id` int(11) NOT NULL auto_increment,
`SiteCode` varchar(8) NOT NULL default '',
`UserID` varchar(30) NOT NULL default '',
`QueryString` text NOT NULL,
`Duration` float default NULL,
`DTS` timestamp NOT NULL default CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `DTS` (`DTS`),
KEY `Duration` (`Duration`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
This allows us to store the information about any query we run on our databases, in reality, we only commit queries to this log that we are interested in, not all queries (this is done by passing a flag through to our query routine).
The code to analyse this table and run the 'EXPLAIN' queries against each query is as follows;
<?php
################################################################################
# index_check.php script
# ------------------------------------------------------------------------------
# Analyses query log file to see how many indexes were used during the query
################################################################################
$host =$argv[1];
$uname =$argv[2];
$password =$argv[3];
$database =$argv[4];
################################################################################
# Connect to the database
$mycon=mysql_connect($host, $uname, $password);
$mydb=mysql_select_db($database);
if ( ! $mydb ) {
print "No database found of that name";
die();
}
# Print out a header row for the analysis data
print "Table Name,Index,Used,QueryNo\n";
################################################################################
# pull the queries we want to analyse
$strSQL = "SELECT id, QueryString
FROM QUERY_INFO
WHERE QueryString REGEXP '^select'
";
$qryResult = mysql_query($strSQL);
while ( $qryRow = mysql_fetch_array($qryResult) ) {
# Now run the select query, adding EXPLAIN to the beginning...
$explSQL = "EXPLAIN " . $qryRow["QueryString"];
$explResult = @mysql_query($explSQL);
if ( $explResult ) {
while ( $explRow = mysql_fetch_array($explResult) ) {
if ( $explRow["possible_keys"] != "" ) {
# Extract a list of the possible indexes in this table
$arrKeys=explode(",", $explRow["possible_keys"]);
# Now print out a row for each index,
# Stating whether the index was used or not...
foreach ( $arrKeys as $key => $value ) {
print $explRow["table"] . ",";
print $value . ",";
print ( ( $explRow["key"] == $value ) ? "1" : "0" ) . ",";
print $qryRow["id"] . "\n";
}
}
}
}
}
mysql_close();
?>
This only analyses those queries which start with a 'SELECT' statement, a number of ours create temp tables so I'm having to make some changes to look at these types of queries as well, it should be easy, it just means stripping the 'CREATE TABLE blahblah' from the beginning of the query.
The output of the script is as follows;
Table Name,Index,Used,QueryNo
thd,Month,0,1
TH,PRIMARY,0,7987
TH,TH_Unique,1,7987
thd,TH_id,1,7987
thd,Month,0,7987
TH,PRIMARY,0,7988
TH,TH_Unique,1,7988
thd,TH_id,1,7988
thd,Month,0,7988
TH,PRIMARY,0,12663
TH,TH_Unique,0,12663
thd,TH_id,1,12663
thd,Month,0,12663
TH,PRIMARY,0,12664
TH,TH_Unique,0,12664
thd,TH_id,1,12664
thd,Month,0,12664
TH,PRIMARY,0,12667
TH,TH_Unique,0,12667
thd,Data_Unique,1,12667
thd,Month,0,12667
This gives you the table name (unfortunately only the table alias if used - which doesn't help with sub-queries! I'll work on extracting that part if I get time!), followed by the index, whether the index was used, and what row in the query table was being analysed.
This csv can then be analysed in Excel (or other), to sum the third column, filtering on the others if required to find out which indexes have a zero (or low) usage count.
Hope this helps anyone, it has already identified a number of indexes that aren't being used, or are being used very infrequently.
Hasgaroth.
|
|
| | | | |
Goto Forum:
Current Time: Mon Jul 6 21:20:26 EDT 2009
Total time taken to generate the page: 0.01365 seconds |