MySQL Cluster - Cluster circular replication with 2 replication channels
A few days ago I had to deal with MySQL Cluster replication again. I had not done this for a while, so I was prepared for some interesting surprises once again.
Those for whom MySQL Cluster - Cluster circular replication is daily business may skip this article. All others can possibly profit from our learnings.
I am talking about the following MySQL Cluster set-up:
You can find more detailed information about such set-ups in the MySQL Cluster documentation.
Situations that lead to a channel fail-over
The following problems with MySQL Cluster lead to a channel fail-over:
- The MySQL master can lose the connection to the MySQL Cluster (lost_event, gap).
- The MySQL master cannot catch up with the load of the MySQL Cluster and thus gets a lost event (gap). I really wonder about the argument in the documentation [1] that in such a case we can do a channel fail-over: when Master 1 cannot catch up with the load, why should Master 2 be capable of doing it?
What a gap (lost_event) is you can see in the following sketch:
I do not discuss here how to set up such a system. You can find the instructions for this in the MySQL Cluster documentation.
Our findings were the following:
Duplicate primary key entries
We got some duplicate key entries for our AUTO_INCREMENT Primary Keys:
ERROR 1062 (23000): Duplicate entry '260' for key 'PRIMARY'
The reason for this is that we were setting ndb_autoincrement_prefetch_sz to 256 for better insert performance.
The consequence of this is that when you insert data on all mysqld's of both Clusters at the same time, you sooner or later get a Primary Key conflict.
We solved this problem by setting different auto_increment_offset values for the SQL nodes on both MySQL Clusters.
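As a minimal sketch, assuming only these two Clusters write and using an increment of 2 (the concrete values are illustrative), the SQL nodes could be configured like this:

# my.cnf of the SQL nodes on Cluster A
auto_increment_increment = 2   # step size, leaves room for 2 writing clusters
auto_increment_offset    = 1   # Cluster A generates 1, 3, 5, ...

# my.cnf of the SQL nodes on Cluster B
auto_increment_increment = 2
auto_increment_offset    = 2   # Cluster B generates 2, 4, 6, ...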
Recommended channel fail-over procedure only works with traffic
A further issue, which we already discovered a long time ago with an other customer, is that the recommended channel fail-over procedure [2]:
slave1>  STOP SLAVE;
slave1>  SELECT MAX(epoch) AS latest FROM mysql.ndb_apply_status;
master2> SELECT SUBSTRING_INDEX(File, '/', -1) AS master_log_file
              , Position AS master_log_pos
           FROM mysql.ndb_binlog_index
          WHERE epoch > <latest>
          ORDER BY epoch ASC LIMIT 1;
slave2>  CHANGE MASTER TO master_log_file='<master_log_file>', master_log_pos=<master_log_pos>;
only works when you have traffic on the MySQL Cluster. If you have no traffic, this query returns an empty result set and you have to rely on the SHOW MASTER STATUS command instead.
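In such a quiet situation the channel fail-over could look like the following sketch instead (placeholders as above, values taken from the SHOW MASTER STATUS output):

slave1>  STOP SLAVE;
master2> SHOW MASTER STATUS;   -- note File and Position
slave2>  CHANGE MASTER TO master_log_file='<File>', master_log_pos=<Position>;
slave2>  START SLAVE;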
Documentation bug about log_slave_updates
This is always true for the channels ch3 and ch4 from Cluster B to Cluster A when you have NO traffic on Cluster B and you follow the documentation, where it states that for such a set-up log_slave_updates MUST NOT be enabled [3].
I discussed this with a few people and possibly the documentation is wrong here. The masters of a 2-Cluster replication can have log_slave_updates enabled, and IMHO the masters of a 3-Cluster replication MUST have it enabled.
But still, when you have no traffic on both Clusters, you have to change the method for determining the correct binary log position when you want to switch the channel. This makes it more difficult to automate or script the channel fail-over.
Earlier, if I remember correctly, MySQL Cluster always wrote empty epochs to the binary log, which guaranteed that there was always some traffic on the channels. Back then I complained about it and the MySQL Cluster developers fixed this behaviour. But for the current case this behaviour would actually make sense. So I was looking at the ndb-log-empty-epochs variable and hoped that it would enable this behaviour again. But somehow it did not report these epochs in the binary log in my set-up, at least not in the short time I was looking at it.
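For completeness, this is how the option would be enabled on the masters. Whether it really makes the empty epochs appear in the binary log and in ndb_binlog_index in your version is something to verify yourself, since it did not seem to do so in my quick test:

# my.cnf on the masters
ndb_log_empty_epochs = 1

mysql> SET GLOBAL ndb_log_empty_epochs = ON;   -- it should also be settable at runtime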
Log_slave_updates is also used with the binlog injector thread
A further learning with log_slave_updates was that, according to the documentation, this parameter only comes into play when a slave also acts as a master [4]. That would mean it only writes data to its binary log that it directly retrieves from its master. It seems this assumption was wrong: a master seems to also write the changes coming in through the cluster via the binlog injector thread [5]. This is possibly an other documentation bug, or at least not completely documented.
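To check this behaviour yourself you can look at the binary log of the master. A simple sketch, assuming you change some data through an other SQL node of the same Cluster:

master2> SHOW MASTER STATUS;               -- note the current binary log File
-- now modify some data through an other SQL node of the same Cluster
master2> SHOW BINLOG EVENTS IN '<File>';   -- the change shows up as row events written by the binlog injector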
Skip_slave_start should be used
A further pitfall was that I forgot to set the skip-slave-start variable. IMHO this should always be set on the slaves in such a set-up. When a slave starts, it cannot know whether it should be the active channel or not.
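A minimal sketch of this, assuming the active channel is then started by hand or by your fail-over script:

# my.cnf on all slave mysqld's of both channels
skip-slave-start

Then only the currently active channel is started manually (or by the script):

slave1> START SLAVE;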
To summarize our learnings:
- Monitor your replication channels (a minimal sketch follows after this list).
- Consider using auto_increment_offset when writing to both MySQL Clusters, or avoid writing to both Clusters at the same time.
- Have a clever channel fail-over script.
- Consider enabling log_slave_updates on all the masters.
- Set skip_slave_start on all the slaves.
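As a very simple monitoring sketch, assuming you compare the epoch applied on the slave side with the epoch written on the master side (thresholds and alerting are up to you):

slave1>  SHOW SLAVE STATUS\G   -- Slave_IO_Running and Slave_SQL_Running should be Yes
master1> SELECT MAX(epoch) AS written FROM mysql.ndb_binlog_index;
slave1>  SELECT MAX(epoch) AS applied FROM mysql.ndb_apply_status;
-- a growing difference between written and applied indicates that the channel is lagging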
These were the findings and learnings of my last engagement with MySQL Cluster replication and fail-over of replication channels. If you have more or different experiences, I would be happy to hear about them.
When we find some time, we should write a script to automatically fail over the channels in such a set-up.