Home » Performance » MySQL » Help with really big tables
Help with really big tables [message #1515] Wed, 04 July 2007 21:38 Go to next message
l3vi  is currently offline l3vi
Messages: 2
Registered: July 2007
Junior Member
I have this project where we store keywords by the hour. Then at the end of the month we sum up the data and move it over to another table for research.

In the past we just stored the keyword, and the datetime in a table labeled after the month that it was in, then just rolled them keywords and summed count over to another table for monthly research.

The problem is that the keywords where being used as the keys for cross referencing them to other data that was collected, and this as we all know is not a good thing.

While the first build works really well for what we had in mind, its just not going to cut it where we are trying to head with it.

At this point we are looking at around 4 billion unique keywords per month. So I thought about using innodb for the daily data, then using another table with each keyword. That table would hold a years worth of unique keywords(all time) worth of data to assign keys for each keyword.

The problem with this is that the system would be running a lot of inserts into the monthly table with the key, count, and date, however with the central table holding the keyword and key it would mean that the system and the users would all be touching that table and thus create a bottleneck in the system as millions of selects/inserts where run every minute on the one table with over 400 billion keywords, include fulltext searches.

I have looked at using innob, Sphinx, Partitions, and Im just left stumped at what would be the best way to provide very fast search for users on the monthly totals while at the same time reduce the load from the keywords being used as their own keys.

In a nutshell I have a data warehouse project, that Im trying to find a way to keep everyone in line with their keys without causing a bottleneck from the all the queries, updates, and inserts on the central table that would keep all the keys aligned for joins.

So any ideas on the best way to store 400 billion keywords for search, with a key for reference, and still give very fast results?
Re: Help with really big tables [message #1516 is a reply to message #1515 ] Wed, 04 July 2007 22:13 Go to previous message
l3vi  is currently offline l3vi
Messages: 2
Registered: July 2007
Junior Member
Here are my new tables. I would think at 400 billion rows in keyword_lexicon it would become a heaping mess with some very very slow queries and the users expect to be able to do complex search as if they are on Google. ;p

CREATE TABLE `keyword_Count` (
`ID` int(15) NOT NULL default '0',
`Count` int(15) NOT NULL default '0',
`Date` datetime NOT NULL default '0000-00-00 00:00:00',
PRIMARY KEY (`ID`,`Date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --------------------------------------------------------

--
-- Table structure for table `keyword_Lexicon`
--

CREATE TABLE `keyword_Lexicon` (
`ID` int(15) NOT NULL auto_increment,
`Keyword` varchar(80) collate utf8_unicode_ci NOT NULL default '',
`Status` enum('-1','0','1','2','3','4') collate utf8_unicode_ci NOT NULL default '0',
PRIMARY KEY (`ID`),
UNIQUE KEY `Keyword` (`Keyword`),
FULLTEXT KEY `Keyword_2` (`Keyword`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- --------------------------------------------------------
--
-- Table structure for table `keyword_Monthly`
--
CREATE TABLE `keyword_Monthly` (
`ID` int(15) NOT NULL default '0',
`Count` int(15) NOT NULL default '0',
`MonthDate` date NOT NULL default '0000-00-00',
PRIMARY KEY (`ID`,`MonthDate`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

[Updated on: Wed, 04 July 2007 22:14]

Previous Topic:Optimal DB structure for a forum system
Next Topic:Multiple-column index not working
Goto Forum:
  


Current Time: Mon Jul 6 21:57:51 EDT 2009

Total time taken to generate the page: 0.05239 seconds
.:: Contact :: Home ::.

Powered by: FUDforum 2.7.7.
Copyright ©2001-2007 FUD Forum Bulletin Board Software

MySQL is a trademark of Sun Microsystems.
InnoDB is a trademark of Oracle Corp.

Percona Performance Forums are a service of Percona, Inc.
Not affiliated with Sun Microsystems or Oracle Corp.