You are here

FromDual TechFeed (en)

The Upcoming Leap Second

Jörg Brühe - Mon, 2015-06-29 17:56

The press, be it the general daily newspaper or the computer magazines, is currently informing the public about an upcoming leap second, which will be taken in the night from June 30 to July 1 at 00:00:00 UTC. While we Europeans will enjoy our well-deserved sleep then, this will be at 5 PM (17:00) local time on June 30 for Califormia people, and during the morning of July 1 for people in China, Japan, Korea, or Australia. (Other countries not mentioned for the sake of brevity.) This is different from last time, when the leap second was taken in the night from Saturday to Sunday (2012-July-1 00:00:00 UTC), so it was a weekend everywhere on the globe.

We have got several requests from our customers about this upcoming leap second, whether they need to take any special precautions or whether they "are safe". Well, obviously nobody is "safe" from the leap second in the sense that it would circumvent them, everybody will encounter it on their systems. The concern is whether they have to expect any trouble.

For the people operating MySQL (or any other DBMS), this issue is threefold:

  • How will the operating system behave?
  • How will the database system behave?
  • How will the applications behave?
  • Let us look at the operating system first, dealing with Linux only. (All other operating systems don't show up with significant numbers in our customer base.)
    Linux measures the time in seconds (since Jan 1, 1970, 00:00:00 UTC), and it does not include the leap seconds in this counting. When the leap second is taken, the timestamp value (those seconds) will simply not be advanced at the end of the regular second, but it will re-use its current value in the leap second. As a result, the conversion of timestamps into common time reckoning will still produce values from 0 to 59 for the minute as well as for the second, there will not be a 60. Another consequence is that there is no way to tell the leap second from the preceding regular one.

    MySQL always takes the time information from the Linux kernel, so it will also use the same timestamp value (or "now()" result) for both the regular and the leap second. The MySQL manual describes this on its own leap second page, which has become inaccurate by some recent code changes: We could not reproduce the results given there (and will probably file a documentation bug about this). However, that difference does not affect the principle of the page's first paragraph.

    About the application, it is hard to claim anything - there are too many of them. However, it is obvious that an application might run into trouble if it managed to store a new timestamp value every second and assumed they are distinct: they will not be (unless the application manages to skip the leap second). Let's hope any application programmer creating such a high-resolution application was aware of the problem.

    So does it all look fine? Not completely: Following the last leap second (just three years ago), several administrators noticed that their machines became extraordinarily busy, which even let the power consumption rise significantly. This was caused by a bug in the Linux kernel, it let user processes resume immediately if they were trying to wait on a high-resolution timer. The exact details are beyond the scope of this article, your favourite search engine will provide you with more than enough references if you ask it.
    For our purposes, the important fact is that this also happened to MySQL server processes ("mysqld") because InnoDB was doing such operations.

    Back then, Sheeri K. Cabral (well-known in the MySQL blogosphere) published a cure which her team had discovered:
    This loop can be broken by simply setting the Linux kernel's clock to the current time.
    date -s "`date`"
    While this looks like a no-op at first sight, I assume that it will reset the sub-second time information and so will have a small effect, which obviously is sufficient to terminate that loop. (In practice, you should stop your NTP daemon before changing the date, and restart it afterwards. The exact command will depend on your Linux distribution.)

    Well, this was three years ago. The issue was reported to kernel developers, a fix was developed (which gives credit to Sheeri's report) and applied to newer kernels, and it w as also backported to older kernels. From my search, it seems that all reasonably maintained installations should have received that fix. For example, I have seen notices that Ubuntu 12.04 (since kernel 3.2.0-29.46) and 14.04 do have it, as does RedHat 6.4 (since kernel 2.6.32-298); for other distributions, I did not search.

    In addition to a current kernel, you obviously need current packages including the leap second information. Typically, this would be "ntp", "ntpdate", "tzdata", and related ones. Don't forget to restart your NTP daemon after installing these updates!

    So I wish all our readers that the leap second may not get them into trouble, or that (if worst comes to worst) the cure published by Sheeri may get them out. However, I have to remind you of the old truth which is attributed to Mark Twain (sometimes also to Niels Bohr):
    "It is difficult to make predictions, especially about the future."

    FromDual Backup Manager for MySQL 1.2.2 has been released

    Shinguz - Tue, 2015-06-23 11:33

    FromDual has the pleasure to announce the release of the new version 1.2.2 of the popular Backup Manager for MySQL and MariaDB (fromdual_bman).

    You can download the FromDual Backup Manager from here.

    In the inconceivable case that you find a bug in the Backup Manager please report it to our Bugtracker.

    Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

    Upgrade from 1.2.x to 1.2.2 # cd ${HOME}/product # tar xf /download/fromdual_brman-1.2.2.tar.gz # rm -f fromdual_brman # ln -s fromdual_brman-1.2.2 fromdual_brman
    Changes in FromDual Backup Manager 1.2.2 FromDual Backup Manager

    It contains mainly fixes with brman catalog and physical backups.

    You can verify your current FromDual Backup Manager version with the following command:

    fromdual_bman --version
    • Archiving with physical backup bug fixed.
    • Connect replaced by OO style and error exit fixed.
    • Create catalog fixed.
    • Archivedir without archive option does not make sense.

    Wir suchen Dich: MySQL/MariaDB DBA für FromDual Support

    Shinguz - Mon, 2015-06-22 13:51
    Taxonomy upgrade extras: jobDBAsupportWer sind wir?

    FromDual ist das führende unabhängige Beratungs- und Dienstleistungsunternehmen für MySQL, Galera Cluster, MariaDB und Percona Server in Europa mit Hauptsitz in der Schweiz.

    Unsere Kunden stammen hauptsächlich aus Europa und reichen vom kleinen Start-Up bis zur europäischen Top-500 Firma. Sie erhalten von uns Support bei Datenbank-Problemen, direkte Eingriffe als remote-DBA, Schulung für ihre DBAs und Entwickler sowie Beratung bei Architektur- und Design-Entscheidungen. Ausserdem entwickeln wir Tools rund um MySQL, schreiben Blog-Artikel und halten Vorträge bei Konferenzen.

    Da unsere qualitativ guten Dienstleistungen immer mehr Kunden anziehen, brauchen wir Kollegen (m/w), welche selbst und mit uns wachsen wollen.

    Stellenbeschreibung

    Wir suchen deutschsprachige Mitarbeiter (Sie oder Ihn) auf Junior- oder Senior-Level für Dienstleistungen rund um MySQL (hauptsächlich Support und remote-DBA Arbeiten) in Vollzeit. Primär solltest Du sicherstellen, dass die geschäftskritischen MySQL-Datenbanken unserer Kunden wie am Schnürchen laufen - und falls nicht, diese schnell wieder ans Laufen kriegen...


    Unser/e "Wunschkandidat/in"

    • hat Erfahrung im Betrieb kritischer und hoch verfügbarer produktiver Datenbanken hauptsächlich auf Linux,
    • kennt Replikation in allen Variationen aus der täglichen Arbeit,
    • weiß, wie die meist verbreiteten MySQL-HA-Setups funktionieren und wie man sie wieder effizient repariert, wenn ein Problem auftritt,
    • ist sattelfest in SQL,
    • bringt Erfahrung mit Galera Cluster mit,
    • kann Bash skripten und einfache Programme in mindestens einer verbreiteten Programmier-/Skripting-Sprache (PHP, Bash, ...) erstellen.

    Wir suchen Verstärkung, die von soliden Grundlagen aus auf dem Weg zu diesem Ideal ist.


    Was wir von Dir erwarten:

    • Kenntnisse in MySQL, Percona Server oder MariaDB oder Bereitschaft, sich diese anzueignen
    • wissen, wie man kritische Datenbank-Systeme betreibt
    • Verständnis, was beim Betrieb von Datenbanken falsch laufen kann
    • selbständige Arbeitsweise (remote) mit Kommunikation über IRC, Skype, Mail und Telefon
    • Kenntnisse des Linux Systems

    DBA- oder DevOps-Erfahrungen wären z.B. eine gute fachliche Basis.


    Du schätzt den direkten Kontakt mit Kunden, hast ein gutes Gespür für deren Probleme, kannst zuhören und findest schnell die eigentlichen Probleme. Du bist gewohnt, proaktiv zu handeln bevor etwas passiert, und führst den Kunden wieder auf den richtigen Pfad zurück.


    Um Deine Arbeit erledigen zu können, arbeitest Du in einer europäischen Zeitzone. Deine Arbeitszeit kannst Du, der betrieblichen Situation entsprechend, flexibel gestalten. Wir erwarten, dass Du Deinen Beitrag zum Bereitschaftsdienst leistest. FromDual hat voraussichtlich keine Büroräumlichkeiten in Deinem Wohnort. Ein Umzug ist jedoch nicht notwendig: Wir ermöglichen Dir das Arbeiten von zu Hause aus oder unterstützen Dich bei der Suche einer geeigneten Arbeitsräumlichkeit in Deiner Nähe. Gute schriftliche und mündliche Englischkenntnisse sind erforderlich.

    Was wir Dir bieten:
    • Deinen Leistungen angemessenes Gehalt.
    • Möglichkeit Dich zum Top MySQL-Datenbankspezialisten zu entwickeln.
    • Selbständiges Arbeiten.
    • Verantwortung für Deine Projekte und Kunden zu übernehmen.
    • Gute Kameradschaft im Team, sowie lockerer und angenehmer Umgang.
    • Stellenbezogene Weiterbildungsmöglichkeiten.
    • Teilnahme an Open Source Anlässen.
    • Arbeit von Deinem bevorzugten Wohnort aus.

    Du solltest in der Lage sein, die meiste Zeit selbständig zu arbeiten, zu denken und zu handeln und Dir neues Wissen selbständig anzueignen (durch Web-Suche, die MySQL-Dokumentation, Ausprobieren, etc.). Solltest Du dennoch einmal nicht weiterkommen, werden Dir Deine Kollegen von FromDual gerne helfen.


    Wenn Du jemanden brauchst, der Dir die ganze Zeit Dein Händchen hält, ist FromDual nicht die richtige Wahl.


    Wie geht es weiter

    Wenn Du an dieser Chance interessiert bist und Du denkst, dass Du die passende Kandidatin oder der passende Kandidat bist, würden wir uns freuen, von Dir zu hören. Wir wissen, dass niemand 100% auf diese Stellenbeschreibung passt!


    Bitte schicke Deinen ungeschönten Lebenslauf mit Deinen Gehaltsvorstellungen an jobs@fromdual.com. Wenn Du mehr über diese Stelle erfahren oder wenn Du mit mir persönlich sprechen möchtest, ruf mich bitte an unter +41 79 830 09 33 (Oli Sennhauser, CTO). Bitte nur Bewerber, KEINE Headhunter!


    Nachdem Du uns Deinen Lebenslauf zugeschickt hast, darfst Du Deine Fähigkeiten in einem kleinen MySQL-Test unter Beweis zu stellen. Nach bestandenem Test laden wir Dich für die finalen Interviews ein.

    Controlling worldwide manufacturing plants with MySQL

    Shinguz - Thu, 2015-05-14 21:43
    Taxonomy upgrade extras: multi-source replicationmysql-replicationreplicationMulti-Master Replicationfan-in replicationrow filteringGTID

    A MySQL customer of FromDual has different manufacturing plants spread across the globe. They are operated by local companies. FromDuals customer wants to maintain the manufacturing receipts centralized in a MySQL database in the Head Quarter in Europe. Each manufacturing plant should only see their specific data.

    Manufacturing log information should be reported backup to European Head Quarter MySQL database.

    The process was designed as follows:

    Preparation of Proof of Concept (PoC)

    To simulate all cases we need different schemas. Some which should be replicated, some which should NOT be replicated:

    CREATE DATABASE finance; CREATE TABLE finance.accounting ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `data` varchar(255) DEFAULT NULL, `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`), KEY `data_rename` (`data`) ); CREATE DATABASE crm; CREATE TABLE crm.customer ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `data` varchar(255) DEFAULT NULL, `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`), KEY `data_rename` (`data`) ); CREATE DATABASE erp; -- Avoid specifying Storage Engine here!!! CREATE TABLE erp.manufacturing_data ( id INT UNSIGNED NOT NULL AUTO_INCREMENT , manufacture_plant VARCHAR(32) , manufacture_info VARCHAR(255) , PRIMARY KEY (id) , KEY (manufacture_plant) ); CREATE TABLE erp.manufacturing_log ( id INT UNSIGNED NOT NULL AUTO_INCREMENT , manufacture_plant VARCHAR(32) , log_data VARCHAR(255) , PRIMARY KEY (id) , KEY (manufacture_plant) );
    MySQL replication architecture

    Before you start with such complicated MySQL set-ups it is recommended to make a little sketch of what you want to build:

    Preparing the Production Master database (Prod M1)

    To make use of all the new and cool features of MySQL we used the new GTID replication. First we set up a Master (Prod M1) and its fail-over System (Prod M2) in the customers Head Quarter:

    # /etc/my.cnf [mysqld] binlog_format = row # optional log_bin = binary-log # mandatory, also on Slave! log_slave_updates = on # mandatory gtid_mode = on # mandatory enforce_gtid_consistency = on # mandatory server-id = 39 # mandatory

    This step requires a system restart (one minute downtime).

    Preparing the Production Master standby database (Prod M2)

    On Master (Prod M1):

    GRANT REPLICATION SLAVE ON *.* TO 'replication'@'192.168.1.%' IDENTIFIED BY 'secret'; mysqldump -u root --set-gtid-purged=on --master-data=2 --all-databases --triggers --routines --events > /tmp/full_dump.sql

    On Slave (Prod M2):

    CHANGE MASTER TO MASTER_HOST='192.168.1.39', MASTER_PORT=3306 , MASTER_USER='replication', MASTER_PASSWORD='secret' , MASTER_AUTO_POSITION=1; RESET MASTER; -- On SLAVE! system mysql -u root < /tmp/full_dump.sql START SLAVE;

    To make it easier for a Slave to connect to its master we set a VIP in front of those 2 database servers (VIP Prod). This VIP should be used by all applications in the head quarter and also the filter engines.

    Set-up filter engines (Filter BR and Filter CN)

    To make sure every manufacturing plant sees only the data it is allowed to see we need a filtering engine between the production site and the manufacturing plant (Filter BR and Filter CN).

    To keep this filter engine lean we use a MySQL instance with all tables converted to the Blackhole Storage Engine:

    # /etc/my.cnf [mysqld] binlog_format = row # optional log_bin = binary-log # mandatory, also on Slave! log_slave_updates = on # mandatory gtid_mode = on # mandatory enforce_gtid_consistency = on # mandatory server-id = 36 # mandatory default_storage_engine = blackhole

    On the production master (Prod M1) we get the data as follows:

    mysqldump -u root --set-gtid-purged=on --master-data=2 --triggers --routines --events --no-data --databases erp > /tmp/erp_dump_nd.sql

    The Filter Engines (Filter BR and CN) are set-up as follows::

    -- Here we can use the VIP! CHANGE MASTER TO master_host='192.168.1.33', master_port=3306 , master_user='replication', master_password='secret' , master_auto_position=1; RESET MASTER; -- On SLAVE! system cat /tmp/erp_dump_nd.sql | sed 's/ ENGINE=[a-zA-Z]*/ ENGINE=blackhole/' | mysql -u root START SLAVE;

    Do not forget to also create the replication user on the filter engines.

    GRANT REPLICATION SLAVE ON *.* TO 'replication'@'192.168.1.%' IDENTIFIED BY 'secret';
    Filtering out all non ERP schemata

    We only want the erp schema to be replicated to the manufacturing plants, not the crm or the finance application. This we achieve with the following option on the filter engines:

    # /etc/my.cnf [mysqld] replicate_do_db = erp replicate_ignore_table = erp.manufacturing_log
    MySQL row filtering

    To achieve row filtering we use TRIGGERS. Make sure they are not replicated further down the hierarchy:

    SET SESSION SQL_LOG_BIN = 0; use erp DROP TRIGGER IF EXISTS filter_row; delimiter // CREATE TRIGGER filter_row BEFORE INSERT ON manufacturing_data FOR EACH ROW BEGIN IF ( NEW.manufacture_plant != 'China' ) THEN SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Row was filtered out.' , CLASS_ORIGIN = 'FromDual filter trigger' , SUBCLASS_ORIGIN = 'filter_row' , CONSTRAINT_SCHEMA = 'erp' , CONSTRAINT_NAME = 'filer_row' , SCHEMA_NAME = 'erp' , TABLE_NAME = 'manufacturing_data' , COLUMN_NAME = '' , MYSQL_ERRNO = 1644 ; END IF; END; // delimiter ; SET SESSION SQL_LOG_BIN = 0;

    Up to now this would cause to stop replication for every filtered row. To avoid this we tell the Filtering Slaves to skip this error number:

    # /etc/my.cnf [mysqld] slave_skip_errors = 1644
    Attaching production manufacturing Slaves (Man BR M1 and Man CN M1)

    When we have finished everything on our head quarter site. We can start with the manufacturing sites (BR and CN):

    On Master (Prod M1):

    mysqldump -u root --set-gtid-purged=on --master-data=2 --triggers --routines --events --where='manufacture_plant="Brazil"' --databases erp > /tmp/erp_dump_br.sql mysqldump -u root --set-gtid-purged=on --master-data=2 --triggers --routines --events --where='manufacture_plant="China"' --databases erp > /tmp/erp_dump_cn.sql

    On the Manufacturing Masters (Man BR M1 and Man BR M2). Here we do NOT use a VIP because we think a blackhole storage engine is robust enough as master:

    CHANGE MASTER TO master_host='192.168.1.43', master_port=3306 , master_user='replication', master_password='secret' , master_auto_position=1; RESET MASTER; -- On SLAVE! system cat /tmp/erp_dump_br.sql | mysql -u root START SLAVE;

    The standby manufacturing (Man BR M2 and Man CN M2) database is created in the same way as the production manufacturing database on the master.

    Testing replication from HQ to manufacturing plants

    First we make sure, crm and finance is not replicated out and replication also does not stop (on Prod M1):

    INSERT INTO finance.accounting VALUES (NULL, 'test data over VIP', NULL); INSERT INTO finance.accounting VALUES (NULL, 'test data over VIP', NULL); INSERT INTO crm.customer VALUES (NULL, 'test data over VIP', NULL); INSERT INTO crm.customer VALUES (NULL, 'test data over VIP', NULL); UPDATE finance.accounting SET data = 'Changed data'; UPDATE crm.customer SET data = 'Changed data'; DELETE FROM finance.accounting WHERE id = 1; DELETE FROM crm.customer WHERE id = 1; SELECT * FROM finance.accounting; SELECT * FROM crm.customer; SHOW SLAVE STATUS\G

    The schema filter seems to work correctly. Then we check if also the row filter works correctly. For this we have to run the queries in statement based replication (SBR)! Otherwise the trigger would not fire:

    use mysql INSERT INTO erp.manufacturing_data VALUES (NULL, 'China', 'Highly secret manufacturing info as RBR.'); INSERT INTO erp.manufacturing_data VALUES (NULL, 'Brazil', 'Highly secret manufacturing info as RBR.'); -- This needs SUPER privilege... :-( SET SESSION binlog_format = STATEMENT; -- Caution those rows will NOT be replicated!!! -- See filter rules for SBR INSERT INTO erp.manufacturing_data VALUES (NULL, 'China', 'Highly secret manufacturing info as SBR lost.'); INSERT INTO erp.manufacturing_data VALUES (NULL, 'Brazil', 'Highly secret manufacturing info as SBR lost.'); use erp INSERT INTO manufacturing_data VALUES (NULL, 'China', 'Highly secret manufacturing info as SBR.'); INSERT INTO manufacturing_data VALUES (NULL, 'Brazil', 'Highly secret manufacturing info as SBR.'); INSERT INTO manufacturing_data VALUES (NULL, 'Germany', 'Highly secret manufacturing info as SBR.'); INSERT INTO manufacturing_data VALUES (NULL, 'Switzerland', 'Highly secret manufacturing info as SBR.'); SET SESSION binlog_format = ROW; SELECT * FROM erp.manufacturing_data;
    Production data back to head quarter

    Now we have to take care about the production data on their way back to the HQ. To achieve this we use the new MySQL 5.7 feature called multi source replication. For multi source replication the replication repositories must be kept in tables instead of files:

    # /etc/my.cnf [mysqld] master_info_repository = TABLE # mandatory relay_log_info_repository = TABLE # mandatory

    Then we have to configure 2 replication channels from Prod M1 to their specific manufacturing masters over the VIP (VIP BR and VIP CN):

    CHANGE MASTER TO MASTER_HOST='192.168.1.98', MASTER_PORT=3306 , MASTER_USER='replication', MASTER_PASSWORD='secret' , MASTER_AUTO_POSITION=1 FOR CHANNEL "manu_br"; CHANGE MASTER TO MASTER_HOST='192.168.1.99', MASTER_PORT=3306 , MASTER_USER='replication', MASTER_PASSWORD='secret' , MASTER_AUTO_POSITION=1 FOR CHANNEL "manu_cn"; START SLAVE FOR CHANNEL 'manu_br'; START SLAVE FOR CHANNEL 'manu_cn'; SHOW SLAVE STATUS FOR CHANNEL 'manu_br'\G SHOW SLAVE STATUS FOR CHANNEL 'manu_cn'\G

    Avoid to configure and activate the channels on Prod M2 as well.

    Testing back replication from manufacturing plants

    Brazil on Man BR M1:

    INSERT INTO manufacturing_log VALUES (1, 'Production data from Brazil', 'data');

    China on Man CN M1:

    INSERT INTO manufacturing_log VALUES (2, 'Production data from China', 'data');

    For testing:

    SELECT * FROM manufacturing_log;

    Make sure you do not run into conflicts (Primary Key, AUTO_INCREMENTS). Make sure filtering is defined correctly!

    To check the different channel states you can use the following command:

    SHOW SLAVE STATUS\G or SELECT ras.channel_name, ras.service_state AS 'SQL_thread', ras.remaining_delay , CONCAT(user, '@', host, ':', port) AS user , rcs.service_state AS IO_thread, REPLACE(received_transaction_set, '\n', '') AS received_transaction_set FROM performance_schema.replication_applier_status AS ras JOIN performance_schema.replication_connection_configuration AS rcc ON rcc.channel_name = ras.channel_name JOIN performance_schema.replication_connection_status AS rcs ON ras.channel_name = rcs.channel_name ;
    Troubleshooting Inject empty transaction

    If you try to skip a transaction as you did earlier (SQL_SLAVE_SKIP_COUNTER) you will face some problems:

    STOP SLAVE; ERROR 1858 (HY000): sql_slave_skip_counter can not be set when the server is running with @@GLOBAL.GTID_MODE = ON. Instead, for each transaction that you want to skip, generate an empty transaction with the same GTID as the transaction

    To skip the next transaction you have find the ones applied so far:

    SHOW SLAVE STATUS\G ... Executed_Gtid_Set: c3611091-f80e-11e4-99bc-28d2445cb2e9:1-20

    then tell MySQL to skip this by injecting a new empty transaction:

    SET SESSION GTID_NEXT='c3611091-f80e-11e4-99bc-28d2445cb2e9:21'; BEGIN; COMMIT; SET SESSION GTID_NEXT='AUTOMATIC'; SHOW SLAVE STATUS\G ... Executed_Gtid_Set: c3611091-f80e-11e4-99bc-28d2445cb2e9:1-21 START SLAVE;
    Revert from GTID-based replication to file/position-based replication

    If you want to fall-back from MySQL GTID-based replication to file/position-based replication this is quite simple:

    CHANGE MASTER TO MASTER_AUTO_POSITION = 0;
    MySQL Support and Engineering

    If you need some help or support our MySQL support and engineering team is happy to help you.

    Logging Galera Cluster conflicts

    Shinguz - Sat, 2015-04-11 12:30
    Taxonomy upgrade extras: logginggaleraclusterconflictdeadlockerror logerror

    We typically suggest our customers to use our MySQL/Galera Cluster my.cnf configuration template to avoid MySQL configuration and performance problems.

    And we are paranoid as well. Thus we enable all useful logging:

    wsrep_log_conflicts = 1

    But this has also some consequences of more visibility...

    If you monitor carefully your Galera Cluster for example with the FromDual Performance Monitor for MySQL and MariaDB, you might probably see some strange values increasing from time to time:

    mysql< SHOW GLOBAL STATUS LIKE 'wsrep_local_%r_s'; +---------------------------+-------+ | Variable_name | Value | +---------------------------+-------+ | wsrep_local_cert_failures | 42 | | wsrep_local_bf_aborts | 13 | +---------------------------+-------+

    Those values are indicators that some transactions (Galera write sets) did to not succeed and were aborted by Galera. In this case the paranoid logging helps to find, what exactly was aborted and possibly helps to find out, if this can or should be fixed:

    150410 1:44:18 [Note] WSREP: cluster conflict due to certification failure for threads: 150410 1:44:18 [Note] WSREP: Victim thread: THD: 151856, mode: local, state: executing, conflict: cert failure, seqno: 30399304 SQL: UPDATE login SET lTsexpire = UNIX_TIMESTAMP(NOW()) + lTimeout WHERE lSessionId = 'va3ta7besku82k56ncv3bnhlj5' *** Priority TRANSACTION: TRANSACTION 464359568, ACTIVE 0 sec starting index read mysql tables in use 1, locked 1 1 lock struct(s), heap size 360, 0 row lock(s) MySQL thread id 4, OS thread handle 0x7f1c0916c700, query id 8190690 Update_rows_log_event::find_row(30399302) *** Victim TRANSACTION: TRANSACTION 464359562, ACTIVE 0 sec mysql tables in use 1, locked 1 2 lock struct(s), heap size 360, 1 row lock(s), undo log entries 1 MySQL thread id 151856, OS thread handle 0x7f1c09091700, query id 8190614 172.20.100.11 sam_angiz query end UPDATE login SET lTsexpire = UNIX_TIMESTAMP(now()) + lTimeout WHERE lSessionId = 'va3ta7besku82k56ncv3bnhlj5' *** WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 835205 page no 3 n bits 72 index `PRIMARY` of table `fromdual`.`login` trx table locks 1 total table locks 2 trx id 464359562 lock_mode X locks rec but not gap lock hold time 0 wait time before grant 0 150410 1:44:18 [Note] WSREP: cluster conflict due to high priority abort for threads: 150410 1:44:18 [Note] WSREP: Winning thread: THD: 4, mode: applier, state: executing, conflict: no conflict, seqno: 30399302 SQL: (null) 150410 1:44:18 [Note] WSREP: Victim thread: THD: 151856, mode: local, state: committing, conflict: no conflict, seqno: -1 SQL: UPDATE login SET lTsexpire = UNIX_TIMESTAMP(now()) + lTimeout WHERE lSessionId = 'va3ta7besku82k56ncv3bnhlj5'

    In the above Galera conflict 2 login transactions where running at the same time. They both come with the same Session ID and want to update the expiry timestamp. Now how to solve or fix this:

    • First check, if this table has a Primary Key (tables without a PK causes full table scans which can last for long time, increasing the chance for conflicts).
    • Second check, if there is a (UNIQUE?) index on lSessionId. A missing index leads to full table scans which increases the chance for conflicts.
    • Third check WHY 2 logins from the same Session ID can arrive at the same time (within 1 second) on 2 different Galera nodes (Ajax requests, etc...). Try to avoid such situations.

    Galera Cluster last inactive check and VMware snapshots

    Shinguz - Sat, 2015-04-11 11:46
    Taxonomy upgrade extras: galeravmwaresnapshot

    From time to time we see at Galera Cluster customer engagements the following, for me scary, warning in the MySQL error log:

    [Warning] WSREP: last inactive check more than PT1.5S ago (PT7.06159S), skipping check

    We mostly see this in VMware set-ups. Some further enquiry with the Galera developers did not give a satisfying answer:

    This can be seen on bare metal as well - with poorly configured mysqld, O/S, or simply being overloaded. All it means is that this thread could not get CPU time for 7.1 seconds. You can imagine that access to resources in virtual machines is even harder (especially I/O) than on bare metal, so you will see this in virtual machines more often.

    This is not a Galera specific issue (it just reports being stuck, other mysqld threads are equally stuck) so there is no configuration options for that. You simply must make sure that your system and mysqld are properly configured, that there is enough RAM (buffer pool not over provisioned), that there is swap, that there are proper I/O drivers installed on guest and so on.

    Basically, Galera runs in virtual machines as well as well virtual machines approximates bare metal.

    We were still suspecting that this is somehow VMware related. This week we had the chance to investigate... At 01:36 am node Galera2 lost connection to the Cluster and became NON-PRIMARY. This is basically a bad sign:

    150401 1:36:15 [Warning] WSREP: last inactive check more than PT1.5S ago (PT5.08325S), skipping check 150401 1:36:15 [Note] WSREP: (09c6b2f2, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://192.168.42.2:4567 150401 1:36:16 [Note] WSREP: view(view_id(NON_PRIM,09c6b2f2,30) memb { 09c6b2f2,0 } joined { } left { } partitioned { ce6bf2e1,0 d1f9bee0,0 }) 150401 1:36:16 [Note] WSREP: view(view_id(NON_PRIM,09c6b2f2,31) memb { 09c6b2f2,0 } joined { } left { } partitioned { ce6bf2e1,0 d1f9bee0,0 }) 150401 1:36:16 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1 150401 1:36:16 [Note] WSREP: Flow-control interval: [16, 16] 150401 1:36:16 [Note] WSREP: Received NON-PRIMARY. 150401 1:36:16 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 26304132) 150401 1:36:16 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1 150401 1:36:16 [Note] WSREP: Flow-control interval: [16, 16] 150401 1:36:16 [Note] WSREP: Received NON-PRIMARY. 150401 1:36:16 [Warning] WSREP: Send action {(nil), 328, TORDERED} returned -107 (Transport endpoint is not connected) 150401 1:36:16 [Note] WSREP: New cluster view: global state: dcca768c-b5ad-11e3-bbc0-fb576fb3c451:26304132, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3 150401 1:36:17 [Note] WSREP: (09c6b2f2, 'tcp://0.0.0.0:4567') reconnecting to d1f9bee0 (tcp://192.168.42.1:4567), attempt 0

    I suspected, after some investigation with the FromDual Performance Monitor for MySQL and MariaDB, that the database backup (mysqldump) could be the reason. It was not. But the customer explained, that after the database backup they do a VMware snapshot.

    And when we compared our problem with the backup log file:

    2015/04/01 01:35:08 [3] backup.fromdual.com: Creating a snapshot of galera3 2015/04/01 01:35:16 [3] backup.fromdual.com: Created a snapshot of galera3 2015/04/01 01:35:23 [3] backup.fromdual.com: galera3: backup the changed blocks of disk 'Festplatte 1' using NBD transport 2015/04/01 01:36:10 [3] backup.fromdual.com: galera3: saving the Change Block Tracking's reference for disk 'Festplatte 1' 2015/04/01 01:36:10 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera3

    we can see that our problem pretty much started with the end of the WMware snapshot (01:36:10 + 5.08 = 1:36:15). By the way: For such kind of investigations it is always good to have a ntp daemon for time synchronization running. Otherwise problem investigation becomes much harder...

    Some more and deeper investigation shows that we loose from time to time nodes during VMware snapshots (galera3). But they recover quickly because they can do an IST. In worst case we can loose 2 nodes and then the whole Galera Cluster has gone.

    192.168.42.3 / node Galera3 2015-04-10 01:44:00 [3] backup.fromdual.com: Creating a snapshot of galera3 2015-04-10 01:44:08 [3] backup.fromdual.com: Created a snapshot of galera3 2015-04-10 01:44:15 [3] backup.fromdual.com: galera3: backup the changed blocks of disk 'Festplatte 1' using NBD transport 2015-04-10 01:45:39 [3] backup.fromdual.com: galera3: saving the Change Block Tracking's reference for disk 'Festplatte 1' 2015-04-10 01:45:39 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera3
    150410 1:44:07 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567 tcp://galera2:4567 150410 1:44:07 [Warning] WSREP: last inactive check more than PT1.5S ago (PT7.06159S), skipping check 150410 1:44:08 [Note] WSREP: Received NON-PRIMARY. 150410 1:44:10 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 30399299) 150410 1:44:11 [Warning] WSREP: Gap in state sequence. Need state transfer. 150410 1:44:11 [Note] WSREP: Prepared IST receiver, listening at: tcp://galera3:4568 150410 1:44:11 [Note] WSREP: Member 0.0 (galera3) requested state transfer from '*any*'. Selected 2.0 (galera2)(SYNCED) as donor. 150410 1:44:11 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 30399309) 150410 1:44:11 [Note] WSREP: Requesting state transfer: success, donor: 2 150410 1:44:11 [Note] WSREP: 2.0 (galera2): State transfer to 0.0 (galera3) complete. 150410 1:44:11 [Note] WSREP: Member 2.0 (galera2) synced with group. 150410 1:44:11 [Note] WSREP: Receiving IST: 8 writesets, seqnos 30399291-30399299 150410 1:44:11 [Note] WSREP: IST received: dcca768c-b5ad-11e3-bbc0-fb576fb3c451:30399299 150410 1:44:11 [Note] WSREP: 0.0 (galera3): State transfer from 2.0 (galera2) complete. 150410 1:44:11 [Note] WSREP: Shifting JOINER -> JOINED (TO: 30399309) 150410 1:44:11 [Note] WSREP: Member 0.0 (galera3) synced with group. 150410 1:44:11 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 30399309) 150410 1:44:11 [Note] WSREP: Synchronized with group, ready for connections 150410 1:44:13 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting off 150410 1:45:42 [Warning] WSREP: last inactive check more than PT1.5S ago (PT2.47388S), skipping check 150410 1:45:43 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567 tcp://galera2:4567 150410 1:45:44 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') reconnecting to 54de92f8 (tcp://galera1:4567), attempt 0 150410 1:45:44 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') reconnecting to c9d964d3 (tcp://galera2:4567), attempt 0 150410 1:45:48 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting off 150410 1:47:26 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567 150410 1:47:27 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') reconnecting to 54de92f8 (tcp://galera1:4567), attempt 0 150410 1:47:31 [Note] WSREP: (158f71de, 'tcp://0.0.0.0:4567') turning message relay requesting off
    192.168.42.1 / node Galera1 2015-04-10 01:47:24 [3] backup.fromdual.com: Creating a snapshot of galera1 2015-04-10 01:47:29 [3] backup.fromdual.com: Created a snapshot of galera1 2015-04-10 01:47:40 [3] backup.fromdual.com: galera1: backup the changed blocks of disk 'Festplatte 1' using NBD transport 2015-04-10 01:48:43 [3] backup.fromdual.com: galera1: saving the Change Block Tracking's reference for disk 'Festplatte 1' 2015-04-10 01:48:44 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera1 150410 1:44:02 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567 150410 1:44:04 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0 150410 1:44:12 [Note] WSREP: Member 0.0 (galera3) requested state transfer from '*any*'. Selected 2.0 (galera2)(SYNCED) as donor. 150410 1:45:43 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567 150410 1:45:44 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0 150410 1:45:48 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting off 150410 1:47:27 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.66452S), skipping check 150410 1:47:27 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567 150410 1:47:30 [Note] WSREP: (54de92f8, 'tcp://0.0.0.0:4567') turning message relay requesting off
    192.168.42.2 / node Galera2 2015-04-10 02:09:55 [3] backup.fromdual.com: Creating a snapshot of galera2 2015-04-10 02:09:58 [3] backup.fromdual.com: Created a snapshot of galera2 2015-04-10 02:10:05 [3] backup.fromdual.com: galera2: backup the changed blocks of disk 'Festplatte 1' using NBD transport 2015-04-10 02:10:53 [3] backup.fromdual.com: galera2: saving the Change Block Tracking's reference for disk 'Festplatte 1' 2015-04-10 02:10:54 [3] backup.fromdual.com: Removing Arkeia's snapshot of galera2
    150410 1:44:02 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567 150410 1:44:03 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0 150410 1:44:08 [Warning] WSREP: discarding established (time wait) 158f71de (tcp://192.168.42.3:4567) 150410 1:44:11 [Note] WSREP: Member 0.0 (galera3) requested state transfer from '*any*'. Selected 2.0 (galera2)(SYNCED) as donor. 150410 1:44:13 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting off 150410 1:45:43 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera3:4567 150410 1:45:44 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') reconnecting to 158f71de (tcp://galera3:4567), attempt 0 150410 1:45:48 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting off 150410 1:47:26 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://galera1:4567 150410 1:47:27 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') reconnecting to 54de92f8 (tcp://galera1:4567), attempt 0 150410 1:47:30 [Note] WSREP: (c9d964d3, 'tcp://0.0.0.0:4567') turning message relay requesting off 150410 2:09:57 [Warning] WSREP: last inactive check more than PT1.5S ago (PT1.83618S), skipping check

    The backups are done with the 2 options:

    enabled.

    Possibly this is the reason and one should disable those features in combination with Galera. Further investigation is going on. In worst case VMware snapshotting with Galera should be avoided.

    Rename MySQL Partition

    Shinguz - Fri, 2015-03-06 15:00
    Taxonomy upgrade extras: partitionrenameDDL

    Before I forget it and have to search again here a short note about how to rename a MySQL Partition:

    My dream:

    ALTER TABLE history RENAME PARTITION p2015_kw10 INTO p2015_kw09;
    In reality: ALTER TABLE history REORGANIZE PARTITION p2015_kw10 INTO ( PARTITION p2015_kw09 VALUES LESS THAN (UNIX_TIMESTAMP('2015-03-02 00:00:00')) );

    Caution: REORGANIZE PARTITION causes a full copy of the whole partition!

    Hint: I assume it would be very easy for MySQL or MariaDB to make this DDL command an in-place operation...

    MySQL Partitioning was introduced in MySQL 5.1.

    MySQL Enterprise Incremental Backup simplified

    Shinguz - Wed, 2015-02-25 19:41
    Taxonomy upgrade extras: mebMySQL Enterprise BackupenterpriseBackupincremental backup

    MySQL Enterprise Backup (MEB) has the capability to make real incremental (differential and cumulative?) backups. The actual releases are quite cool and you should really look at it...

    Unfortunately the original MySQL documentation is much too complicated for my simple mind. So I did some testing and simplified it a bit for our customers...

    If you want to dive into the original documentation please look here: Making an Incremental Backup .

    If you want to use MySQL Enterprise Backup please let us know and we send you a quote...

    Prepare MySQL Backup infrastructure mkdir /backup/full /backup/incremental1 /backup/incremental2
    Full MySQL Backup mysqlbackup --defaults-file=/etc/my.cnf --user=root --backup-dir=/backup/full backup mysqlbackup --defaults-file=/etc/my.cnf --user=root --backup-dir=/backup/full apply-log
    First MySQL Incremental Backup mysqlbackup --defaults-file=/etc/my.cnf --user=root --incremental --incremental-base=dir:/backup/full --incremental-backup-dir=/backup/incremental1 backup mysqlbackup --defaults-file=/etc/my.cnf --user=root --backup-dir=/backup/full --incremental-backup-dir=/backup/incremental1 apply-incremental-backup
    Second MySQL Incremental Backup mysqlbackup --defaults-file=/etc/my.cnf --user=root --incremental --incremental-base=dir:/backup/full --incremental-backup-dir=/backup/incremental2 backup mysqlbackup --defaults-file=/etc/my.cnf --user=root --backup-dir=/backup/full --incremental-backup-dir=/backup/incremental2 apply-incremental-backup

    and so on...

    MySQL Restore mysqlbackup --defaults-file=/etc/my.cnf --user=root --backup-dir=/backup/full copy-back

    Have fun with MySQL Enterprise Backup. If you need any help with your MySQL Backup concept, please let us know.

    Creating Event Handlers with MySQL Enterprise Monitor

    Shinguz - Tue, 2015-02-17 13:57
    Taxonomy upgrade extras: MySQL Enterprise Monitormonitoringeventhandlermpmperformance monitor

    MySQL Enterprise Monitor (MEM) has by default no Event Handlers created and activated. These Event Handlers you have to define yourself according to your needs.

    In this article we discuss how to create MySQL Enterprise Monitor Event Handlers with MEM v.3.0.18. For other (older) versions the steps may vary...

    Task: Event Handler for maximum Connections reached

    We would like to be notified by MySQL Enterprise Monitor when the number of connections is near to max_connections.

    For this we search first which Advisors are available at all: Configuration -> Advisors -> Availability.


    Here we can see that we have an Advisor called Maximum Connection Limit Nearing Or Reached which is scheduled for every 5 minutes and has thresholds at 75, 85, 95 and 100%:


    Now we know which Advisor should create and Event. As a next step we have to create and Event which should be triggered: Configuration -> Event Handling -> Create Event Handler.


    Here we can create and Event with all its needed configuration: Events -> All -> server.


    If we look at the Events we can even see the detailed description and how the values for the Event are collected:


    Task: Event Handler for used disk space

    For this Event Handler we need the Advisor Filesystem Free Space under Operating System:


    In this advisor we can configure the Threshold as well:


    In the Event Handler we can define which Assets shall be monitored. For example the mountpoint: /.


    Local disks can only be monitored, if a local MySQL Enterprise Monitor Agent is installed. An agent-less MySQL Enterprise Monitor cannot monitor local disk resources...

    Have fun using the MySQL Enterprise Monitor. If you need any help in installing or configuring MEM do not hesitate to contact us.

    All these functions are also implemented in the FromDual Performance Monitor for MySQL. If you want to relay on Open Source technology only you should consider our Performance Monitor.

    Nagios and Icinga plug-ins for MySQL 1.0.0 have been released

    Shinguz - Wed, 2015-02-04 22:02
    Taxonomy upgrade extras: nagiosicingaplug-inmonitorperformancealert

    FromDual has the pleasure to announce the release of the new version 1.0.0 of its widely used Nagios and Icinga plug-ins for MySQL, Galera Cluster, MariaDB and Percona Server.

    All plug-ins are basically renewed and should now work all correctly.

    The new Nagios/Icinga plug-ins can be downloaded here.

    In the inconceivable case that you find a bug in the Nagios/Icinga plug-ins please report it to our bug tracker.

    Any feedback, statements and testimonials are welcome as well! Please send them to feedback@fromdual.com.

    Description of the current functionality

    Details about the functionality and the usage of each plug-in you get with the option: --help.

    The following Nagios/Icinga plug-in for MySQL and MariaDB are currently available:

    check_db_mysql.pl

    This Nagios/Icinga plug-in alerts you if your MySQL database is not up and running.

    check_errorlog_mysql.pl and errorLogFilterRules.pm

    This Nagios/Icinga plug-in alerts you if it finds some suspicious messages in the MySQL error log.
    The rules which messages should be ignored can be found in the file errorLogFilterRules.pm. If you want to add your own filter rules please add them in this file as well.

    check_galera_nodes.pl

    This Nagios/Icinga plug-in alerts you if the actual number of nodes in your a Galera Cluster is not the expected one.

    check_repl_mysql_cnt_slave_hosts.pl

    This Nagios/Icinga plug-in alerts you if your MySQL Slaves have not reported to their Master properly their existence with the report_host variable.

    check_repl_mysql_heartbeat.pl

    This Nagios/Icinga plug-in alerts you if your MySQL Slave is too many heartbeats behind its Master.

    check_repl_mysql_io_thread.pl

    This Nagios/Icinga plug-in alerts you if your MySQL Slaves IO thread is not up an running.

    check_repl_mysql_read_exec_pos.pl

    This Nagios/Icinga plug-in alerts you if your MySQL Slaves read and execution positions differ too much.

    check_repl_mysql_readonly.pl

    This Nagios/Icinga plug-in alerts you if your MySQL Slave is NOT set to readonly.

    check_repl_mysql_seconds_behind_master.pl

    This Nagios/Icinga plug-in alerts you if your MySQL Slave falls too many seconds behind its Master.

    check_repl_mysql_sql_thread.pl

    This Nagios/Icinga plug-in alerts you if your Slaves SQL thread is not up an running.

    perf_mysql.pl

    This Nagios/Icinga plug-in gathers MySQL and MariaDB performance data.

    Changes in FromDual Nagios/Icinga plug-ins 1.0.0 All plug-ins
    • Usage was improved. The usage can be shown with the --help option.
    • Usage states now which GRANT privileges are needed for a specific plug-in.
    • Examples added how to use each plug-in.
    • Default socket location moved from /tmp/mysql.sock to /var/run/mysqld/mysqld.sock.
    • New host/socket convention implemented in all scripts similar to MySQL client tools.
    • -epn tag added for Icinga.
    check_errorlog_mysql.pl
    • Some bugs fixed.
    • More filtering rules added.
    • Filtering rules separated into own file.
    • Entry point finding problem fixed.
    check_repl_mysql_heartbeat.pl
    • Script name fixed.
    check_db.pl
    • Unknown command problem with Galera Cluster caught.
    • mysqladmin ping removed and implemented in Perl.
    Support and Subscription for commercial use

    For subscriptions for commercial use of this software please get in contact with us.

    Linuxtag: Knowledge and People - and New Colleagues?

    Jörg Brühe - Wed, 2015-02-04 10:49

    At FromDual, we are currently preparing for our participation in the "Chemnitzer Linux-Tage" in March.
    While we don't yet know whether the programme committee accepted our proposed talks, we will have a booth and hope for interesting exchanges with others from the MySQL, database, Linux, ... world. Of course, we will also mention that we are looking for additional colleagues - there are so many tasks that we need more people to handle them all. (In case you got curious, look here: http://www.fromdual.com/mysql-dba-2014-12-de )

    While I attended the Chemnitzer Linux-Tage already last year (thank you, Frank Hofmann, for recommending them!), it will be the first time there for FromDual as a company. I enjoyed the informal atmosphere, which made it easy to get into contact with new people, to learn, and to discuss both facts and opinions.

    To those of you who didn't yet attend such events, I can only say: You are missing some great experiences! It is good to get stimulated by others, by their problems and their experiences, so that one's thoughts are broadened and start considering new fields. Often, they lead to insights that are helpful for one's own daily work.

    In the opposite direction, we at FromDual hope that during the Linux-Tage we can help others to make (better) use of MySQL in all its variants, with its surrounding tools and products, so that everybody will profit.

    The Chemnitzer Linux-Tage will take place on March 21 and 22, 2015, in Chemnitz, Germany. Parts of the programme will probably be delivered in English, so even those of you who don't understand German might enjoy coming there. If you now consider a visit, you will find further information here:
    https://chemnitzer.linux-tage.de/2015/de

    We will be happy to meet you in Chemnitz - to talk about whatever interests both you and us, maybe even about you joining us?

    Download MySQL Enterprise Features

    Shinguz - Mon, 2015-02-02 21:47
    Taxonomy upgrade extras: enterprise monitorBackupworkbenchenterprise

    MySQL provides some great enterprise features beside the MySQL Server. The ones we are asked the most at customers are:

    • MySQL Enterprise Backup (MEB)
    • MySQL Enterprise Monitor (MEM) and
    • MySQL Enterprise Workbench (MWB)
    MySQL Enterprise Backup (MEB)

    MySQL Enterprise Backup (MEB) is an alternative to the mysqldump backup utility. Its big advantage is its fast backup but even faster restore performance. This is a must for all MySQL users having bigger databases than let's say 10 to 20 Gigabytes and/or having hard requirements for restore times (MTTR).

    Last implementation tests we did with a customer for an about 30 Gbyte database were:

     MEBmysqldumpBackup10 minutes18 minutesRestore12 minutes80 minutes

    If you need our help implementing MySQL Enterprise Backup into your backup infrastructure please get in contact with us. MySQL Enterprise Backup also seamlessly integrates into the FromDual Backup Manager for MySQL.

    MySQL Enterprise Monitor (MEM)

    MySQL Enterprise Monitor (MEM) is an Enterprise Monitoring Solution for MySQL which Monitors your business critical MySQL databases. Various predefined advisors rise an alert if something with your precious MySQL database does not work as expected.

    Our alternative competitive product is the FromDual Performance Monitor for MySQL.

    MySQL Enterprise Workbench (MWB)

    MySQL Enterprise Workbench (MWB) is the tool modern MySQL Database administrators use to operate their MySQL databases. Old fashion ones still use the CLI... MySQL developers can easily write and test database queries and develop their data model with the ER diagram modeller.

    Download Enterprise tools

    But how can we get now to these precious tools? This is quite easy following the screen shots below:

    As a fist step you go to www.mysql.com/downloads:


    Here you can find a link to Oracle eDelivery which is the MySQL/Oracle download facility. Then you get to the welcome screen:


    Before you get access to the software you have to Sign In with your Oracle/MySQL customer account if you have one. If you do not have an account yet you can Create an Account to get to the software:


    Then you have to agree (2 times) to the Terms & Restrictions (this is what Oracle is really good in). Once to the Oracle Trial License Agreement and once to the Export Restrictions:


    Then you get to the Media Pack Search. Here you can define what product you are interested in and on which platform you are using it. Unfortunately a sub-product filter cannot be chosen. So you get a long list to pick your final package from:


    An last you can download your product of desire:


    Unfortunately the packages get some silly names like V59684-01.zip instead of meaningful names. But with the following command you get some information what is included in the package:

    unzip -l V59684-01.zip Archive: V59684-01.zip Length Date Time Name --------- ---------- ----- ---- 2958631 2014-11-04 13:29 meb-3.11.1-linux-glibc2.5-x86-64bit.tar.gz 185 2014-11-05 08:30 meb-3.11.1-linux-glibc2.5-x86-64bit.tar.gz.asc 77 2014-11-04 13:29 meb-3.11.1-linux-glibc2.5-x86-64bit.tar.gz.md5 2130 2014-11-05 15:06 README.txt --------- ------- 2961023 4 files

    Have fun trying the MySQL Enterprise Features. If you need any help installing or integrating them into your infrastructure, do not hesitate to contact FromDual.

    MySQL table Point-in-Time-Recovery from mysqldump backup

    Shinguz - Sun, 2015-01-25 19:42
    Taxonomy upgrade extras: BackupRestoreRecoverymysqldumppoint-in-time-recoverypitr

    Sometimes we face the situation where we have a full MySQL database backup done with mysqldump and then we have to restore and recover just one single table out of our huge mysqldump file.
    Further our mysqldump backup was taken hours ago so we want to recover all the changes on that table since our backup was taken up to the end.

    In this blog article we cover all the steps needed to achieve this goal for MySQL and MariaDB.

    Recommendation: It is recommended to do theses steps on a testing system and then dump and restore your table back to the production system. If you do it directly on your production system you have to know exactly what you are doing...
    Further this process should be tested carefully and regularly to get familiar with it and to assure your backup/restore/recovery procedure works properly.

    The table we want to recover is called test.test from our backup full_dump.sql.gz. As a first step we have to do the recovery with the following command to our test database:

    shell> zcat full_dump.sql.gz | extract_table.py --database=test --table=test | mysql -u root

    The script extract_table.py is part of the FromDual Recovery Manager to extract one single table from a mysqldump backup.

    As a next step we have to extract the binary log file and its position where to start recovery from out of our dump:

    shell> zcat full_dump.sql.gz | head -n 25 | grep CHANGE CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000216', MASTER_LOG_POS=1300976;

    Then we have to find out where we want to stop our Point-in-Time-Recovery. The need for recover is possibly due to a TRUCATE TABLE command or similar operation executed on the wrong system or it is just a time somebody has indicated us to recover to. The position to stop we can find as follows:

    shell> mysqlbinlog -v mysql-bin.000216 | grep -B5 TRUNCATE --color #150123 19:53:14 server id 35622 end_log_pos 1302950 CRC32 0x24471494 Xid = 3803 COMMIT/*!*/; # at 1302950 #150123 19:53:14 server id 35622 end_log_pos 1303036 CRC32 0xf9ac63a6 Query thread_id=54 exec_time=0 error_code=0 SET TIMESTAMP=1422039194/*!*/; TRUNCATE TABLE test

    And as a last step we have to apply all the changes from the binary log to our testing database:

    shell> mysqlbinlog --disable-log-bin --database=test --start-position=1300976 --stop-position=1302950 mysql-bin.000216 | mysql -u root --force

    Now the table test.test is recovered to the wanted point in time and we can dump and restore it to its final location back to the production database.

    shell> mysqldump --user=root --host=testing test test | mysql --user=root --host=production test

    This process has been tested on MySQL 5.1.73, 5.5.38, 5.6.22 and 5.7.5 and also on MariaDB 10.0.10 and 10.1.0.

    Introducing Myself: Jörg Brühe

    Jörg Brühe - Mon, 2015-01-19 20:56
    For some time already, FromDual's "Our Team" page lists me, and it even reveals that I joined in September, 2014. Also for some time, the list of FromDual blogs contains an entry "Jörg's Blog", but it doesn't lead to any entries. It is high time to fix this and create entries, starting with an introduction of myself.

    Often, in such introductions people use the phrase of "the new kid on the block". I won't. If I am to use those words, I will arrange them as "the kid on the new block". The reason is that I don't feel as a new kid in the MySQL village (or is it a city?), let alone in DBMS country.

    Ever since I left university (Technical University of Berlin, Germany), I have been involved in SQL DBMS development. After my previous product's team had been dissolved in Berlin and maintenance moved to Riga, Latvia, as a cost-cutting measure, I joined MySQL AB in 2004 as a member of the Build Team. Some of you will remember my name from MySQL release announcements, others may have seen it in bug reports, newsgroup entries, or other MySQL communication. Together with the other MySQL colleagues, I got acquired by Sun in 2008 and then by Oracle in 2010.

    In 2012, I decided I wanted a culture change, from a highly (and very strictly) organized corporation to a smaller, more flexible entity. That brought me to an internet site, where I worked as a DBA, concentrating on their (more than 100) MySQL installations. It was very interesting to view MySQL from the DBA side now, I got a big load of new experiences and learned a lot about using, running, and automating MySQL (rather than building and testing it).

    That might have continued, but as usual the world is changing, and my DBA job was no exception: Company strategy was to shift those tasks more to the various application teams and to do without a central point for MySQL. It is a sign how far MySQL knowledge has spread, and how little effort may be sufficient to run MySQL, that this worked. OTOH, this invalidated the assumptions on which my job was based, and I did not really fit the alternatives left in that company.

    In that situation, my contact to FromDual got renewed, and both the company and me felt we should be a good match. Till now, there is no indication we were wrong: I like the work, I feel I'm productive and make customers happy, the tasks are manifold and interesting (so interesting that writing the first blog gets delayed quite long), and I expect this will continue.

    "Ad multos annos!"

    Impacts of max_allowed_packet size problems on your MySQL database

    Shinguz - Sun, 2015-01-18 11:18
    Taxonomy upgrade extras: max_allowed_packetconnectionBackupRestoredump

    We recently run into some troubles with max_allowed_packet size problems during backups with the FromDual Backup/Recovery Manager and thus I investigated a bit more in the symptoms of such problems.

    Read more about: max_allowed_packet.

    A general rule for max_allowed_packet size to avoid problems is: All clients and the server should have set the same value for max_allowed_packet size!

    I prepared some data for the test which looked as follows:

    mysql> SELECT id, LEFT(data, 30), LENGTH(data), ts FROM test; +----+--------------------------------+--------------+------+ | id | left(data, 30) | length(data) | ts | +----+--------------------------------+--------------+------+ | 1 | Anhang | 6 | NULL | | 2 | Anhang | 6 | NULL | | 3 | Anhangblablablablablablablabla | 2400006 | NULL | | 4 | Anhang | 6 | NULL | +----+--------------------------------+--------------+------+

    Max_packet_size was set to a too small value then:

    mysql> SHOW GLOBAL VARIABLES WHERE variable_name = 'max_allowed_packet'; +--------------------+---------+ | Variable_name | Value | +--------------------+---------+ | max_allowed_packet | 1048576 | +--------------------+---------+

    The first test was to retrieve the too big row:

    mysql> SELECT * FROM test WHERE id = 3; ERROR 2020 (HY000): Got packet bigger than 'max_allowed_packet' bytes mysql> SELECT CURRENT_USER(); ERROR 2006 (HY000): MySQL server has gone away No connection. Trying to reconnect... Connection id: 6 Current database: test

    We got an error message AND we were disconnected from the server. This is indicated with the message MySQL server has gone away which is basically wrong. We were disconnected and not the server has died or similar in this case.

    A further symptom is that we get an entry in the MySQL error log about this incident:

    [Warning] Aborted connection 3 to db: 'test' user: 'root' host: 'localhost' (Got an error writing communication packets)

    So watching carefully such error messages in your MySQL error log with the script check_error_log_mysql.pl from our Nagios/Icinga plugins would be a good idea...

    The mysqldump utility basically does the same as a SELECT command so I tried this out and got the same error:

    shell> mysqldump -u root test > /tmp/test_dump.sql mysqldump: Error 2020: Got packet bigger than 'max_allowed_packet' bytes when dumping table `test` at row: 2

    And again we get an error message in the error log! This is also a good indicator to see if your backup, made with mysqldump failed in this case.

    To get a proper dump we have to configure the mysqldump utility properly:

    shell> mysqldump --max-allowed-packet=5000000 -u root test > /tmp/test_dump.sql

    After the backup we tried to restore the data:

    shell> mysql -u root test < /tmp/test_dump.sql ERROR 2006 (HY000) at line 40: MySQL server has gone away

    Again we got an error on the command line and in the MySQL error log:

    [Warning] Aborted connection 11 to db: 'test' user: 'root' host: 'localhost' (Got a packet bigger than 'max_allowed_packet' bytes)

    and further the data are only partially loaded:

    mysql> SELECT * FROM test; +----+--------+------+ | id | data | ts | +----+--------+------+ | 1 | Angang | NULL | | 2 | Angang | NULL | +----+--------+------+

    Another symptom we can see here is that the MySQL status aborted_clients is increased in all 3 situation:

    mysql> SHOW GLOBAL STATUS WHERE variable_name = 'aborted_clients'; +-----------------+-------+ | Variable_name | Value | +-----------------+-------+ | Aborted_clients | 10 | +-----------------+-------+

    One positive aspect is that with MySQL 5.7.5 the first 2 symptoms do not appear any more...

    Further information you can find here: Communication Errors and Aborted Connections.

    Avoid temporary disk tables with MySQL

    Shinguz - Fri, 2014-12-19 07:38
    Taxonomy upgrade extras: temporary tablediskselectquery tuning

    For processing SELECT queries MySQL needs some times the help of temporary tables. These temporary tables can be created either in memory or on disk.

    The number of creations of such temporary tables can be found with the following command:

    mysql> SHOW GLOBAL STATUS LIKE 'created_tmp%tables'; +-------------------------+-------+ | Variable_name | Value | +-------------------------+-------+ | Created_tmp_disk_tables | 4 | | Created_tmp_tables | 36 | +-------------------------+-------+
    There are 2 different reasons why MySQL is creating a temporary disk table instead of a temporary memory table:
    • The result is bigger than the smaller one of the MySQL variables max_heap_table_size and tmp_table_size.
    • The result contains columns of type BLOB or TEXT.
    In the following example we can see how the temporary disk table can be avoided without changing the column types: mysql> CREATE TABLE test ( id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY , data TEXT , type TINYINT UNSIGNED ); mysql> INSERT INTO test VALUES (NULL, 'State is green', 1), (NULL, 'State is green', 1) , (NULL, 'State is red', 3), (NULL, 'State is red', 3) , (NULL, 'State is red', 3), (NULL, 'State is orange', 2); mysql> EXPLAIN SELECT data, COUNT(*) FROM test GROUP BY data; +----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+ | 1 | SIMPLE | test | ALL | NULL | NULL | NULL | NULL | 6 | Using temporary; Using filesort | +----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+ mysql> SHOW SESSION STATUS LIKE 'created_tmp%tables'; +-------------------------+-------+ | Variable_name | Value | +-------------------------+-------+ | Created_tmp_disk_tables | 0 | | Created_tmp_tables | 3 | +-------------------------+-------+ mysql> SELECT data, COUNT(*) FROM test GROUP BY data; +-----------------+----------+ | data | count(*) | +-----------------+----------+ | State is green | 2 | | State is orange | 1 | | State is red | 3 | +-----------------+----------+ mysql> SHOW SESSION STATUS LIKE 'created_tmp%tables'; +-------------------------+-------+ | Variable_name | Value | +-------------------------+-------+ | Created_tmp_disk_tables | 1 | | Created_tmp_tables | 4 | +-------------------------+-------+ mysql> SELECT SUBSTR(data, 1, 32), COUNT(*) FROM test GROUP BY SUBSTR(data, 1, 32); mysql> SHOW SESSION STATUS LIKE 'created_tmp%tables'; +-------------------------+-------+ | Variable_name | Value | +-------------------------+-------+ | Created_tmp_disk_tables | 1 | | Created_tmp_tables | 5 | +-------------------------+-------+

    This method can be used if changing the table structure from TEXT to VARCHAR or the use of a RAM disk are not possible solutions.

    Making HAProxy High Available for MySQL Galera Cluster

    Shinguz - Sun, 2014-12-14 18:37
    Taxonomy upgrade extras: HAProxyload balancerGalera ClusterVIPvirtual IPHigh Availabilityha

    After properly installing and testing a Galera Cluster we see that the set-up is not finished yet. It needs something in front of the Galera Cluster that balances the load over all nodes.
    So we install a load balancer in front of the Galera Cluster. Typically nowadays HAProxy is chosen for this purpose. But then we find, that the whole Galera Cluster is still not high available in case the load balancer fails or dies. So we need a second load balancer for high availability.
    But how should we properly fail-over when the HAProxy load balancer dies? For this purpose we put a Virtual IP (VIP) in front of the HAProxy load balancer pair. The Virtual IP is controlled and fail-overed with Keepalived.

    Installation of HAProxy and Keepalived

    First some preparations: For installing socat we need the repoforge repository:

    shell> cd /tmp shell> wget http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el6.rf.x86_64.rpm shell> yum localinstall rpmforge-release-0.5.3-1.el6.rf.x86_64.rpm shell> yum update shell> yum install socat

    Then we can start installing HAProxy and Keepalived:

    shell> yum install haproxy keepalived shell> chkconfig haproxy on shell> chkconfig keepalived on

    We can check the installed HAProxy and Keepalived versions as follows:

    shell> haproxy -v HA-Proxy version 1.5.2 2014/07/12 shell> keepalived --version Keepalived v1.2.13 (10/15,2014)
    Configuration of HAProxy

    More details you can find in the HAProxy documentation.

    shell> cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak shell> cat << _EOF >/etc/haproxy/haproxy.cfg # # /etc/haproxy/haproxy.cfg # #--------------------------------------------------------------------- # Global settings #--------------------------------------------------------------------- global # to have these messages end up in /var/log/haproxy.log you will # need to: # # 1) configure syslog to accept network log events. This is done # by adding the '-r' option to the SYSLOGD_OPTIONS in # /etc/sysconfig/syslog # # 2) configure local2 events to go to the /var/log/haproxy.log # file. A line like the following can be added to # /etc/sysconfig/syslog # # local2.* /var/log/haproxy.log # log 127.0.0.1 local2 chroot /var/lib/haproxy pidfile /var/run/haproxy.pid maxconn 1020 # See also: ulimit -n user haproxy group haproxy daemon # turn on stats unix socket stats socket /var/lib/haproxy/stats.sock mode 600 level admin stats timeout 2m #--------------------------------------------------------------------- # common defaults that all the 'frontend' and 'backend' sections will # use if not designated in their block #--------------------------------------------------------------------- defaults mode tcp log global option dontlognull option redispatch retries 3 timeout queue 45s timeout connect 5s timeout client 1m timeout server 1m timeout check 10s maxconn 1020 #--------------------------------------------------------------------- # HAProxy statistics backend #--------------------------------------------------------------------- listen haproxy-monitoring *:80 mode http stats enable stats show-legends stats refresh 5s stats uri / stats realm Haproxy\ Statistics stats auth monitor:AdMiN123 stats admin if TRUE frontend haproxy1 # change on 2nd HAProxy bind *:3306 default_backend galera-cluster backend galera-cluster balance roundrobin server nodeA 192.168.1.61:5201 maxconn 151 check server nodeB 192.168.1.61:5202 maxconn 151 check server nodeC 192.168.1.61:5203 maxconn 151 check _EOF
    Starting and testing HAProxy

    The HAProxy can be started as follows:

    shell> service haproxy start

    and then be checked either over the socket:

    shell> socat /var/lib/haproxy/stats.sock readline prompt > show info > show stat > help

    or over your favourite web browser entering the username and password (monitor:AdMiN123) specified in the configuration file above:

    To check the application over the load balancer we can run the following command:

    shell> mysql --user=app --password=secret --host=192.168.1.38 --port=3306 --exec="SELECT @@wsrep_node_name;" +-------------------+ | @@wsrep_node_name | +-------------------+ | Node C | +-------------------+ shell> mysql --user=app --password=secret --host=192.168.1.38 --port=3306 --exec="SELECT @@wsrep_node_name;" +-------------------+ | @@wsrep_node_name | +-------------------+ | Node A | +-------------------+ shell> mysql --user=app --password=secret --host=192.168.1.38 --port=3306 --exec="SELECT @@wsrep_node_name;" +-------------------+ | @@wsrep_node_name | +-------------------+ | Node B | +-------------------+
    Configuration a Virtual IP (VIP) with Keepalived

    Now we have 2 HAProxy load balancers. But what happens if one of them fails. Then we do not want to reconfigure our application to work properly again. The fail-over should happen automatically. For this we need a Virtual IP which should automatically fail-over.

    More details you can find in the Keepalived documentation and the keepalived user guide.

    shell> cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak cat << _EOF >/etc/keepalived/keepalived.conf # # /etc/keepalived/keepalived.conf # global_defs { notification_email { remote-dba@fromdual.com root@localhost } # Change email from on lb2: notification_email_from lb1@haproxy1 router_id HAPROXY } vrrp_script chk_haproxy { script "killall -0 haproxy" interval 2 weight 2 } vrrp_instance GALERA_VIP { interface eth1 virtual_router_id 42 # Higher priority on other node priority 101 # 102 advert_int 1 # notify "/usr/local/bin/VRRP-notification.sh" virtual_ipaddress { 192.168.1.99/32 dev eth1 } track_script { chk_haproxy } authentication { auth_type PASS auth_pass secret } } _EOF
    Starting and testing Keepalived

    To test the keepalived we can run the following command:

    shell> keepalived -f /etc/keepalived/keepalived.conf --dont-fork --log-console --log-detail ^C

    To finally start it the following command will serve:

    shell> service keepalived start

    To check the Virtual IP the following command will help:

    shell> ip addr show eth1

    And then we can check our application over the VIP:

    shell> mysql --user=app --password=secret --host=192.168.1.99 --port=3306 --exec="SELECT @@wsrep_node_name;"
    Literature

    failed MySQL DDL commands and Galera replication

    Shinguz - Tue, 2014-12-09 15:45
    Taxonomy upgrade extras: galerareplicationDDLTOIRSU

    We have recently seen a case where the following command was executed on a Galera Cluster node:

    SQL> GRANT SUPER ON userdb.* TO root@127.0.0.111; ERROR 1221 (HY000): Incorrect usage of DB GRANT and GLOBAL PRIVILEGES
    2014-12-09 14:53:55 7457 [Warning] Did not write failed 'GRANT SUPER ON userdb.* TO root@127.0.0.111' into binary log while granting/revoking privileges in databases. 2014-12-09 14:53:55 7457 [ERROR] Slave SQL: Error 'Incorrect usage of DB GRANT and GLOBAL PRIVILEGES' on query. Default database: ''. Query: 'GRANT SUPER ON userdb.* TO root@127.0.0.111', Error_code: 1221 2014-12-09 14:53:55 7457 [Warning] WSREP: RBR event 1 Query apply warning: 1, 17 2014-12-09 14:53:55 7457 [Warning] WSREP: Ignoring error for TO isolated action: source: c5e54ef5-7faa-11e4-97b0-5e5c695f08a5 version: 3 local: 0 state: APPLYING flags: 65 conn_id: 4 trx_id: -1 seqnos (l: 4, g: 17, s: 15, d: 15, ts: 113215863294782)

    According to the error message it looks like this command is done in Total Order Isolation (TOI) mode during the Rolling Schema Upgrade (RSU).

    Only on the nodes which did NOT receive this wrong command the error log message was written and further they have received a GRA_*.log file.

    Analysis of the GRA_*.log (failed transactions) files:

    hexdump -C GRA_2_16.log 00000000 f3 fe 86 54 02 53 14 00 00 76 00 00 00 76 00 00 |...T.S...v...v..| 00000010 00 00 00 04 00 00 00 00 00 00 00 00 00 00 2a 00 |..............*.| 00000020 00 00 00 00 00 01 00 00 00 40 00 00 00 00 06 03 |.........@......| 00000030 73 74 64 04 21 00 21 00 08 00 0b 04 72 6f 6f 74 |std.!.!.....root| 00000040 09 6c 6f 63 61 6c 68 6f 73 74 00 67 72 61 6e 74 |.localhost.grant| 00000050 20 53 55 50 45 52 20 6f 6e 20 75 73 65 72 64 62 | SUPER on userdb| 00000060 2e 2a 20 74 6f 20 72 6f 6f 74 40 31 32 37 2e 30 |.* to root@127.0| 00000070 2e 30 2e 31 31 31 |.0.111 |
    dd if=bin-log.000001 of=binlog.header bs=1 count=120 cat binlog.header GRA_2_17.log > GRA_2_17.binlog_events mysqlbinlog GRA_2_17.binlog_events ... # at 120 #141209 15:04:54 server id 5201 end_log_pos 118 CRC32 0x3432312e Query thread_id=45 exec_time=0 error_code=0 SET TIMESTAMP=1418133894/*!*/; SET @@session.pseudo_thread_id=4/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1073741824/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C utf8 *//*!*/; SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; grant SUPER on userdb.* to root@127.0.0.111 /*!*/; DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

    It further looks like this command was issues by Connection ID number 4: conn_id: 4.

    How to recover deleted tablespace?

    Abdel-Mawla Gharieb - Fri, 2014-11-14 22:56

    Sometimes, MySQL tablespace file(s) might be deleted by mistake, e.g. delete the shared tablespace (ibdata1) or an individual tablespace (table_name.ibd).

    In this post I will show you how to recover those files (on Linux OS) having only one condition, MySQL service should still be running. If MySQL service stopped after deleting that file, this method will not work, so it is extremely important to act as quick as possible to avoid data loss.

    The following is a simple table creation (innodb_file_per_table is enabled) and the records count inside that table:

    SQL> SHOW CREATE TABLE t\G *************************** 1. row *************************** Table: t Create Table: CREATE TABLE `t` ( `id` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=23 DEFAULT CHARSET=latin1 1 row in set (0.00 sec) SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 22 | +----------+ 1 row in set (0.02 sec)

    Now, lets delete the individual tablespace for that table:

    shell> rm -rf /var/lib/mysql/test/t.ibd

    At this time, we can still select and modify that table!!

    SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 23 | +----------+ 1 row in set (0.00 sec)

    To be more accurate, rm does not actually delete the file, rather it removes the directory entry pointing to the file's inode. The inode - and in consequence the file - will be removed only if this is the last reference, but as long as the MySQL server process has the file opened, there is another reference which is the open file handle (that's why the "mysqld" server process must still be running).

    To list the opened files we can use the Linux command lsof (we filter the output to get only the deleted tablespace information):

    shell> lsof |grep t.ibd COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME mysqld 11401 mysql 25uW REG 7,0 98304 1010691 /var/lib/mysql/test/t.ibd (deleted)

    The file has a tag of (deleted) which means that the directory entry pointing to the file's inode was deleted but there is another reference(s) to that inode, otherwise it won't be listed by the above command.
    Now the question is, how can we get the on-disk path to that opened file if the directory entry was removed?

    We can use the "/proc" interface to the running processes and their file handles by the following formula:

    • File path = /proc/PID/fd/FD-number

    According to the above formula and using the output of the "lsof" command, the file we just deleted is located here:

    shell> ll /proc/11401/fd/25 lrwx------ 1 mysql mysql 64 Oct 28 16:14 /proc/11401/fd/25 -> /var/lib/mysql/test/t.ibd (deleted)

    To make sure that this is the on-disk path for the file we deleted, check the reference: it still points to the original path.

    How can we recover that file??
    • First, we should make sure that no other queries are modifying that table: SQL> LOCK TABLE t READ; Query OK, 0 rows affected (0.00 sec)
    • Then we copy the data blocks (/proc/11401/fd/25) to a new file (we use the original file path) and change the ownership to the MySQL system user (mysql): shell> cp /proc/11401/fd/25 /var/lib/mysql/test/t.ibd shell> chown mysql:mysql /var/lib/mysql/test/t.ibd
    • Restart MySQL service (if we didn't restart MySQL service directly after recovering the tablespace all changes on that table will still be redirected to the open file handle not the just recovered copy and thus will be lost after the restart): shell> service mysql restart ..... SUCCESS! ..... SUCCESS!
    • The tablespace is now recovered and we can modify the table normally: SQL> SELECT COUNT(*) FROM t; +----------+ | COUNT(*) | +----------+ | 23 | +----------+ 1 row in set (0.00 sec) SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> INSERT INTO t VALUES (NULL); Query OK, 1 row affected (0.00 sec) SQL> select COUNT(*) from t; +----------+ | COUNT(*) | +----------+ | 25 | +----------+ 1 row in set (0.00 sec)

    Notes:

    • We can use the same procedures above to recover the shared tablespace (ibdata1) but we should lock all tables before the recovery process by using the SQL command "FLUSH TABLES WITH READ LOCK;"
    • If the MySQL server had to deal with more files (.frm, .ibd, .MYI, .MYD, ...) than its "open_file_limit", it might happen that it will close this handle. In that case, the file will also cease to exist, even though the process is still running and that's why it is extremely important to act as quick as possible.
    • The same procedure can be used to recover MyISAM files (.MYI and .MYD) but note that the file handle will be released if "FLUSH TABLES;" SQL command was executed.
    • The same procedure can be used as well to recover binary logs, general logs, ... etc but note that the file handle will be released if "FLUSH LOGS;" SQL command was executed.
    • This method can be used to recover any deleted file on Linux not only MySQL files but if the inode files have other references (lsof).

    Real life case:

    One of our customers was enabling the general query log on his production system, he noticed that the file was continuously growing and to not to consume the available free disk space on his server he removed that file by "rm /path/to/general_query.log". However, the available free space was still being consumed while he couldn't see the general log file. The customer thought that the file was deleted but in fact, the file handle was still opened by MySQL server process.
    To get the problem solved we only issued the SQL command "FLUSH LOGS;" - which the customer should have issued after removing the file - then the file handle was closed, thus the inode was deleted and the consumed disk space freed back to the system.

    Things you should consider before using GTID

    Abdel-Mawla Gharieb - Fri, 2014-11-14 16:50

    Global Transaction ID (GTID) is one of the major features that were introduced in MySQL 5.6 which provides a lot of benefits (I have talked about the GTID concept, implementation and possible troubleshooting at Percona Live London 2014, you can download the slides from our presentations repository or from my session at Percona Live.
    On the other hand, there are some important things you should consider before deploying GTID in production, I'm going to list them here in this blog post.

    Table of Content Migration to GTID replication

    It is required to shutdown MySQL service on all servers in the replication setup in order to perform the migration from classic replication (based on binary logs information) to the transaction-based (GTID) replication which means that the migration process requires downtime.

    The online migration to GTID replication is not yet available.
    Facebook and Booking.com provided some MySQL patches for this, but they are not yet contained in Oracle's binaries.
    So, if you can't afford a downtime during the migration process, then you might not be able to make the change.

    Non transactionally safe statement will raise errors now

    It is required to enable the system variable (enforce_gtid_consistency) on all servers inside the GTID replication setup which prevent executing the non transactionally safe statements (check GTID restrictions) like:

    • CREATE TABLE .. SELECT.
    • CREATE TEMPORARY TABLE (inside a transaction).
    • Statements that update non-transactional tables inside a transaction.

    So, you will have to fix your application first if it contains any of the above statements before using GTID replication.

    MySQL Performance in GTID

    It is required to enable the variables (bin_log and log_slave_updates) on - at least - the slave servers which affects the performance on those slaves negatively.

    So, the performance should be tested very well before the production migration to GTID replication.

    mysql_upgrade script

    The mysql_upgrade script's problem when executed on a server having gtid_mode=on has been fixed since MySQL 5.6.7, but it is still not recommended to execute mysql_upgrade when gtid_mode=on as it might change system tables that is using MyISAM, which is non transactional.

    Errant transactions! Transactions which are executed on a slave apart from the replication transactions (i.e. not executed on the master) are called "Errant transactions", those transactions cause trouble if that slave later is promoted to be a new master in the a fail-over process.
    Once the other slaves connect to the new master, they send the value of gtid_executed and the master in turn checks those values, compares it with its own gtid_executed set and sends back all missing transactions (the errant transactions) to the slaves which leads to one of the following problems:
    • If those transactions still exist in the new master's binary log files, they will be replicated to the other slave which was not intentional when those were executed only on the slave (new master).
    • If those transactions do no longer exist in the new master's binary log file, the replication will break on all slaves.
    How to avoid such problem?
    • Choose some slaves to be possible candidates for promotion in case of fail-over. Thus, stand alone transactions (which are not coming from the master) should NOT be executed there.
    • Use one of the MySQL utilities (mysqlfailover or mysqlrpladmin) to find out if there are any errant transactions on the slave before the promotion or not.
    Filtration on the slave

    In some cases we might need to make filtration to the replication on the slave(s) i.e. not all tables' or databases' changes are propagated to the slave by using the system variables (replicate_ignore_db or replicate_ignore_table). When the slave receives transactions from the master which modify those ignored tables or databases, it simply skips executing them and when the slave restarted it sends the gtid_excuted to the master and the master finds the missing transactions (those for the ignored tables or databases) and sends them back to the slave.

    Again, that leads to one of the following two conditions:

    • If those transactions still exist in the master's binary log files, then no problem as the slave will skip executing them again!!
    • If those transactions are no longer there in the master's binary log files, the replication will break on the slave.

    Well, the above problem is supposed to be fixed in MySQL 5.6.18 (Bug #70048) The fix is injecting empty transactions on the slave for those ones modifying ignored tables or databases instead of just skipping them, and when the slave restarted they won't be sent back again from the master.

    I listed the above problem although it should be fixed now because I want to mention that having MySQL always updated to the recent release is a good practice to avoid such problems and to get the most bug fixes.

    Conclusion:

    The following are the main things which should be considered before using GTID:

    • Migration from classic replication to transaction-based (GTID) replication requires downtime.
    • Non-transactionally safe statements will not be executed in GTID replication.
    • MySQL performance is a little bit slower in GTID replication, especially, on the slaves.
    • mysql_upgrade script might cause troubles on a server having GTID_MODE=ON and it should be tested first.
    • Errant transactions might break the replication in the fail-over process, thus, planning slaves for promotion will avoid falling in such cases.
    • Some GTID bugs are fixed now (like slave filtration issue), thus MySQL server should be updated to the latest version once there is a new release.
    • New bugs are expected to be discovered, so the application should be tested very well with GTID before performing the migration on production.

    Pages

    Subscribe to FromDual - MySQL, Galera Cluster, MariaDB and Percona Server support, SLA and services aggregator - FromDual TechFeed (en)