Home » Performance » MySQL » Creating The right database for hundreds of thousands of Blog Posts?
Creating The right database for hundreds of thousands of Blog Posts? [message #385] Tue, 21 November 2006 16:40 Go to next message
evan108108  is currently offline evan108108
Messages: 2
Registered: November 2006
Location: NY City
Junior Member

Let me say straight off that I do not have to much control over the
MySQL server itself, but I can say that it has fairly good capacity
and is running MySQL 5.0.19 and is on a Linux box. You can assume a
default setup.

I have a good amount of experience with MySQL, however MySQL
optimization is not one of my better skill sets. I currently have a
site (really a series of them) which employs a small amount of MySQL,
but not for the main storage engine. I need to shift to a full MySQL
solution. It's really not a very complicated site, however there is
lots of data.

The site works essentially like a blog. In fact its a small blog
aggregator. There are posts with dates, the source the post came from
as well as a link to the original article. So what has to be stored is
relatively simple. That said there are currently about 200,000 posts
and counting (uncompressed over 1 gig of data). This could easily grow to over 1 million.

The site is split into sub domains. Each sub
domain relates to a specific topic. For example:
There is a celebrity section and the sub domain is celebrity.MySite.net.
there is a video game section and the sub domain is game.MySite.net
etc...

Currently there are 19 of these sub domains in all, but it's growing.

MySite uses two very small tables at present. The rest of the data is
stored in gziped text files (that's the part that has to change).
These tables are as fallows.

1. sub domain ( stores info about a particular sub domain )

CREATE TABLE `subdomain` (
`sd_id` int(10) unsigned NOT NULL auto_increment,
`sd_name` varchar(255) NOT NULL default '',
`sd_name_disp` varchar(255) NOT NULL default '',
`sd_title` varchar(255) NOT NULL default '',
`sd_desc` varchar(255) NOT NULL default '',
`sd_keywords` varchar(255) NOT NULL default '',
`sd_feed_alt_url` varchar(255) NOT NULL default 'NONE',
`sd_today_post_num` int(10) unsigned NOT NULL default '0',
`sd_other` varchar(255) default NULL,
PRIMARY KEY (`sd_id`)
) TYPE=MyISAM



2. feeds ( stores RSS feed URL's of each of the data sources. Each row
is related to the sub domain it belongs to via the "sd_id" field.

CREATE TABLE `feeds` (
`feed_id` int(10) unsigned NOT NULL auto_increment,
`sd_id` int(10) unsigned NOT NULL default '0',
`feed_name` varchar(255) NOT NULL default '',
`feed_site_url` varchar(255) NOT NULL default '',
`feed_rss_url` varchar(255) NOT NULL default '',
`feed_post_length` varchar(255) NOT NULL default '',
`feed_type` enum('headline','main','both') NOT NULL default 'main',
`feed_other` varchar(255) default NULL,
PRIMARY KEY (`feed_id`)
) TYPE=MyISAM AUTO_INCREMENT=536 ;


So what I do currently is on an hourly bases (via a cron job) read
through each of the rows in the "feeds" table and store any new posts
in the appropriate folder for the corresponding sub domain. But what I
need to do, as I have said is store this data in a MySQL DB. It will
need to be searchable, quickly accessible via sub domain, and quickly
sorted by date time. I would also like to be able to have a global
search across all sub domains if possible.

Currently there is no way to search MySite and if speed is going to
be a problem as far as search I can live with that. The key is to have
quick date sorting so that I can call the latest posts for sub domain
"celebrity" in chronological order with out to much delay.
Re: Creating The right database for hundreds of thousands of Blog Posts? [message #428 is a reply to message #385 ] Mon, 27 November 2006 11:58 Go to previous messageGo to next message
Peter  is currently offline Peter
Messages: 405
Registered: August 2006
Senior Member
Super Guru
In your case you may start with one more table for posts with sub_domain_id column so you can easily lookup data for each subdomains.

If your domains overlap you would need extra table to do many to many mapping.

A lot depends on traffic you are expecting. Ie if it is huge I would think about different tables for subdomains so it is easier to split them to different servers in the future.

You do not get much answers because question is very generic - to plat schema you need to get good feeling of what queries you're going to execute etc.


Peter Zaitsev, MySQL Performance Expert
MySQL Performance Blog - http://www.mysqlperformanceblog.com
MySQL Consulting http://www.mysqlperformanceblog.com/mysql-consulting/
Re: Creating The right database for hundreds of thousands of Blog Posts? [message #430 is a reply to message #385 ] Mon, 27 November 2006 13:07 Go to previous messageGo to next message
evan108108  is currently offline evan108108
Messages: 2
Registered: November 2006
Location: NY City
Junior Member

Thanks for the advice.

I think I will split the sub domains into separate tables as you mentioned. So that leaves me with only a few possible  queries that I will execute.

1. SELECT * FROM posts_SubDomainName ORDER BY posts_date DESC LIMIT X
2. SELECT * FROM posts_SubDomainName WHERE post_id = x
3. SELECT * FROM posts_SubDomainName WHERE post_name = 'x'

4. Search the text of the posts for user entered keywords.

The only other thing I need to know now is how to set up the indexes. Based on these four basic query types, what is the best way to index this table(s).
Re: Creating The right database for hundreds of thousands of Blog Posts? [message #431 is a reply to message #430 ] Tue, 28 November 2006 05:27 Go to previous message
Peter  is currently offline Peter
Messages: 405
Registered: August 2006
Senior Member
Super Guru
So you'll need indexes on (post_date), (post_id), (post_name)
plus full text search index on columns you're going to search.

Full text search may get pretty slow if your table growth large but it is other story.


Peter Zaitsev, MySQL Performance Expert
MySQL Performance Blog - http://www.mysqlperformanceblog.com
MySQL Consulting http://www.mysqlperformanceblog.com/mysql-consulting/
Previous Topic:building new index on big tables
Next Topic:Innodb 'Free buffers' going to 0 and system crashing
Goto Forum:
  


Current Time: Thu Jul 9 21:43:08 EDT 2009

Total time taken to generate the page: 0.01130 seconds
.:: Contact :: Home ::.

Powered by: FUDforum 2.7.7.
Copyright ©2001-2007 FUD Forum Bulletin Board Software

MySQL is a trademark of Sun Microsystems.
InnoDB is a trademark of Oracle Corp.

Percona Performance Forums are a service of Percona, Inc.
Not affiliated with Sun Microsystems or Oracle Corp.