Home » Performance » MySQL » Partial replication
Partial replication [message #8] Fri, 11 August 2006 07:56 Go to next message
radek  is currently offline radek
Messages: 7
Registered: August 2006
Location: London
Junior Member
Hi

I want to create a reporting server to move huge load from transaction server into it. The problem is that there is so many insert/updates in transaction database that simple replication of entire database won't give us much. I heard partly replication is possible so we could replicate only tables which are used by reports.

Here is big picture/use case of what I'm trying to acheive with mysql replication. Assume we have table "logs" and "summary". Every hour about 2GB of data is imported into "logs" table. Every hour that data is summarized into "summary" table using some kind of "INSERT INTO summary ... SELECT COUNT(*)... FROM logs WHERE ...". There are more tables like this one and they are summarized from many other tables using JOINS.
I'm wondering if it's possible to replicate only the "summary" table on slave server? How it works in that situation if we will ignore logs table? Will mysql replication just send new and updated rows inside "summary" table to slave? Or is it neccessary that "logs" table have to exists on slave in order to summarize the data from it on slave as well?

Thanks in advance for any tips

Radek

[Updated on: Fri, 11 August 2006 07:59]

Re: Partial replication [message #9 is a reply to message #8 ] Fri, 11 August 2006 08:09 Go to previous messageGo to next message
Peter  is currently offline Peter
Messages: 405
Registered: August 2006
Senior Member
Super Guru
Hi Radek,

MySQL Replication in MySQL 5.0 and below simply sends statements from master to the slaves (with exception of LOAD DATA INFILE) so if you do not replicate log table creation of summary table will fail on the slaves. Also it could be too expensive to run summary creation table on each of the slaves.

What you could do instead is do summary creation on the master to table which is not replicated and then do something like:

select * from tmp_summary into outfile '/tmp/dump.txt';

load data infile '/tmp/dump.txt' into table summary;

truncate table tmp_summary;

In such case summary will be updated just fine as you're using load date infile.

Note you will need to delete '/tmp/dump.txt' for this to work second time.


Peter Zaitsev, MySQL Performance Expert
MySQL Performance Blog - http://www.mysqlperformanceblog.com
MySQL Consulting http://www.mysqlperformanceblog.com/mysql-consulting/
Re: Partial replication [message #12 is a reply to message #9 ] Fri, 11 August 2006 11:57 Go to previous messageGo to next message
radek  is currently offline radek
Messages: 7
Registered: August 2006
Location: London
Junior Member
Hey Peter

Thanks for great idea! It sounds like it could be done. My only concern here is that it will involve changing existing maintenance process which summarize that data. I'm wondering if it would be possible to somehow force mysql to replicate that data after summarization is finished. Maybe by some kind of:

SELECT * FROM summary INTO OUTFILE '/tmp/dump.txt' WHERE date BETWEEN {maintenance_start_hour} AND {maintenance_end_hour};
LOAD DATA INFILE '/tmp/dump.txt' REPLACE INTO TABLE summary;
TRUNCATE TABLE tmp_summary;

*REPLACE is for situations when we need to rerun maintenance

I think in this way I will not have to even touch (refactor) existing summary process.

Do you know if there are any other options to do it? However it sounds for me that probably there is no better option.

Thanks!
Radek
Re: Partial replication [message #13 is a reply to message #12 ] Fri, 11 August 2006 12:03 Go to previous messageGo to next message
Peter  is currently offline Peter
Messages: 405
Registered: August 2006
Senior Member
Super Guru
Yes sure. This approach should also work. I just proposed example which should be tailored to your specific needs.


Peter Zaitsev, MySQL Performance Expert
MySQL Performance Blog - http://www.mysqlperformanceblog.com
MySQL Consulting http://www.mysqlperformanceblog.com/mysql-consulting/
icon3.gif  Re: Partial replication [message #81 is a reply to message #8 ] Fri, 18 August 2006 21:36 Go to previous messageGo to next message
dschneller  is currently offline dschneller
Messages: 2
Registered: August 2006
Location: Solingen, Germany
Junior Member
Maybe you could somehow use the blackhole filter engine solution I wrote about very recently. Just have a look at this:
http://jroller.com/page/dschneller?entry=mysql_replication_u sing_blackhole_engine

Daniel

Re: Partial replication [message #82 is a reply to message #81 ] Sat, 19 August 2006 09:55 Go to previous messageGo to next message
Peter  is currently offline Peter
Messages: 405
Registered: August 2006
Senior Member
Super Guru
Thanks for your comment,

Blackhole engine is helpful with replication but in different case- it does not help INSERT ... SELECT and similar cases when data is being processed as tables need to be on the same host as query is executing.

However blackhole engine is helpful in other cases as you've mentioned:

- Filtering of logs, especially if you have partial replication over WAN

- Mass distribution - you do not want to feed 50 slaves from loaded master directly.


Peter Zaitsev, MySQL Performance Expert
MySQL Performance Blog - http://www.mysqlperformanceblog.com
MySQL Consulting http://www.mysqlperformanceblog.com/mysql-consulting/
Re: Partial replication [message #83 is a reply to message #8 ] Sat, 19 August 2006 11:22 Go to previous messageGo to next message
dschneller  is currently offline dschneller
Messages: 2
Registered: August 2006
Location: Solingen, Germany
Junior Member
Oh, I see. I misread the first post. I thought the summarizing process was taking place on an application level. In that case partial replication of only the summary tables (INSERTs of summarized values, coming from the application) would have worked and the blackhole engine would have prevented all the raw data inserts from being transported over the wire.
Re: Partial replication [message #88 is a reply to message #83 ] Sun, 20 August 2006 18:00 Go to previous messageGo to next message
Peter  is currently offline Peter
Messages: 405
Registered: August 2006
Senior Member
Super Guru
Right,

With summarization done on application level that would be helpful.
On other hand I would say simple replicate-do-wild-table or replicate-ignore-wild-table is often enough - if you replicate in the local network and do not have too many servers or very high write load log shipping to the slave might cause minimal overhead.


Peter Zaitsev, MySQL Performance Expert
MySQL Performance Blog - http://www.mysqlperformanceblog.com
MySQL Consulting http://www.mysqlperformanceblog.com/mysql-consulting/
Re: Partial replication [message #320 is a reply to message #9 ] Thu, 02 November 2006 04:28 Go to previous message
lars  is currently offline lars
Messages: 1
Registered: November 2006
Location: Stockholm
Junior Member
Peter wrote on Fri, 11 August 2006 08:09

Hi Radek,

MySQL Replication in MySQL 5.0 and below simply sends statements from master to the slaves (with exception of LOAD DATA INFILE) so if you do not replicate log table creation of summary table will fail on the slaves. Also it could be too expensive to run summary creation table on each of the slaves.

What you could do instead is do summary creation on the master to table which is not replicated and then do something like:

select * from tmp_summary into outfile '/tmp/dump.txt';

load data infile '/tmp/dump.txt' into table summary;

truncate table tmp_summary;

In such case summary will be updated just fine as you're using load date infile.

Note you will need to delete '/tmp/dump.txt' for this to work second time.


And in MySQL 5.1, you can use row-based replication to filter (on slave) away everything execept the tables you are interested in.

Using row-based replication, the underlying log table is no longer needed on slave, since only data changes of the summary table is replicated instead of the SQL statements used to update the table.


--
Dr. Lars Thalmann
Replication and Clustering Technology
MySQL AB, www.mysql.com
Previous Topic:Intermittent Connection Failures and High aborted_connects
Next Topic:is using '' around number bad?
Goto Forum:
  


Current Time: Sun Jul 5 16:20:20 EDT 2009

Total time taken to generate the page: 0.00658 seconds
.:: Contact :: Home ::.

Powered by: FUDforum 2.7.7.
Copyright ©2001-2007 FUD Forum Bulletin Board Software

MySQL is a trademark of Sun Microsystems.
InnoDB is a trademark of Oracle Corp.

Percona Performance Forums are a service of Percona, Inc.
Not affiliated with Sun Microsystems or Oracle Corp.