Home » Performance » MySQL » Query Performance Advice
| Query Performance Advice [message #910] |
Fri, 09 March 2007 18:22  |
briscod Messages: 1 Registered: March 2007 |
Junior Member |
|
|
Hi,
I have the following query:
select messages.sid, messages.mid, messages.sender_addr, recipients.rcpt_addr, messages.subject,
date_format(from_unixtime(messages.timestamp),"%M %d, %Y %H:%i:%s") as sent_date, ip_system.hostname
from messages, recipients,ip_system
where messages.mid = recipients.mid
AND messages.sid = recipients.sid
AND messages.sid = ip_system.sid
AND sender_domain = "sendingcompany.com"
AND subject like "Some Subject%"
AND recipients.rcpt_domain = "receivingcompany.com"
AND messages.timestamp >= unix_timestamp("2007-03-08 20:00:00")
group by messages.mid
order by sent_date LIMIT 0,10
Using InnoDB tables on a Windows 2003 Server (not my choice) with 3.5GB of RAM. Dual Xeon 3.6GHz.
messages table has about 45M rows
recipients table has about 70M rows
ip_system table has 4 rows (2 are logically deleted)
Explain Extended:
1,SIMPLE,messages,range,PRIMARY,msg_complete_idx,msg_sender_ domain_idx,msg_timestamp_idx,msg_sid_domain_timestamp_idx,ms g_sender_domain_idx,14,,430,Using where; Using temporary; Using filesort
1,SIMPLE,recipients,ref,PRIMARY,rcpt_domain_idx,rcpt_sid_mid _rid_domain,rcpt_sid_mid_rid_domain,6,message_tracker.messag es.sid,message_tracker.messages.mid,1,Using where
1,SIMPLE,ip_system,ALL,PRIMARY,,,,4,Using where
Typically the query uses the index (sender_domain[8],timestamp) on the messages table. sender_domain is unique for these set of companies in 8 characters or less.
Query times vary based on timeframe (the longer the worse) specified and the message volume of "sendingcompany.com". But can easily take 10-15 minutes. For small volume/timeframes it will be less than a minute.
I am thinking the group by on the computed field could be changed to the source column.
Can a sortfile be prevented by using both columns in the index? i.e. sender_domain, timestamp?
The primary index on the recipients table is mid,sid,rid. This should facilitate the joining of rows from the messages table.
I believe the ip_system table isn't a big player since it logically has only 2 rows.
INNODB STATUS OUTPUT
Per second averages calculated from the last 0 seconds
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 566606, signal count 564804
Mutex spin waits 15390019, rounds 21880070, OS waits 217206
RW-shared spins 438734, OS waits 182672; RW-excl spins 448303, OS waits 125671
------------
TRANSACTIONS
------------
<snip>
--------
FILE I/O
--------
I/O thread 0 state: wait Windows aio (insert buffer thread)
I/O thread 1 state: wait Windows aio (log thread)
I/O thread 2 state: wait Windows aio (read thread)
I/O thread 3 state: wait Windows aio (write thread)
Pending normal aio reads: 0, aio writes: 0,
ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0
Pending flushes (fsync) log: 0; buffer pool: 0
9601252 OS file reads, 8643216 OS file writes, 371528 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf for space 0: size 1, free list len 74, seg size 76, is empty
Ibuf for space 0: size 1, free list len 74, seg size 76,
5461531 inserts, 5461531 merged recs, 845038 merges
Hash table size 6607487, used cells 946582, node heap has 1003 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 404 3161598899
Log flushed up to 404 3161598899
Last checkpoint at 404 3161598899
0 pending log writes, 0 pending chkp writes
153464 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 1771661386; in additional pool allocated 490240
Buffer pool size 101888
Free buffers 0
Database pages 100885
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages read 9601242, created 2974457, written 8324174
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread id 2760, state: waiting for server activity
Number of rows inserted 201389704, updated 426, deleted 113685, read 314844469
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
Any help advice would be appreciated.
David
|
|
|
| Re: Query Performance Advice [message #931 is a reply to message #910 ] |
Thu, 15 March 2007 10:54  |
sterin Messages: 324 Registered: March 2007 Location: Sweden |
Senior Member |
|
|
Yes the sort on the date is bad from a performance point of view.
Because of this sort the DB needs to first convert each date to the new format and then it has to compare them.
Imagine if you have 2,000,000 messages that match your timeframe/company name and each date has to be converted and then you have to sort these 2,000,000 rows.
But you have a lot of inconcistency in your query:
1.
You group on message.mid but you do _not_ use group functions in the SELECT part.
Which means that you don't know which values you are going to get.
So which values do you want to get?
2.
I also find the sort a bit strange since the output you are going to get is:
April 4
December 2
January 30
Which means that it is not sorted in date order but in string order which I feel is very strange when I look at a list of dates.
If you drop this sort we can start to tweak your query.
|
|
|
Goto Forum:
Current Time: Thu Jul 9 23:39:13 EDT 2009
Total time taken to generate the page: 0.00545 seconds |