You are here

How MySQL behaves with many schemata, tables and partitions

Introduction

Recently a customer claimed that his queries were slow some times and sometimes they were fast.

First idea: Flipping query execution plan caused by InnoDB could be skipped because it affected mainly MyISAM tables.

Second idea: Caching effects by either the file system cache caching MyISAM data or the MyISAM key buffer caching MyISAM indexes were examined: File system cache was huge and MyISAM key buffer was only used up to 25%.

I was a bit puzzled...

Then we checked the table_open_cache and the table_definition_cache which were still set to their default values. Their according status information Opened_tables and Opened_table_definitions were clearly indicating that those caches are much too small...

The typical reaction is: increase open_files_limit (after increasing the operating system user limits) to a higher value. Unfortunately most of the Linux distributions have a default values for open files of 1024 which is far to low for typical database systems.

Too many open files

But customer claimed that he tried this already (set open_files_limit to 50'000) and got error messages in the MySQL error log:

# perror 24
OS error code  24:  Too many open files

So we were even more puzzled.

After some further investigation we found that the customer has up to 600 schemata and in each schema he had 30 to 100 tables and some of those tables have even monthly partitions up to 4 years (roughly 50 partitions).

A partition internally is handled similar to a table. So we have something between 18'000 and 3'000'000 tables/partitions in total. We wrote already about the problems with MySQL databases with many tables in Configuration of MySQL for Shared Hosting but now it is even worse with the partitions. I remembered that having too many partitions with MySQL is not a good idea so we investigated a bit deeper in this area:

We first looked at the amount of file handles MySQL has open:

lsof -p <pid> | wc -l

This value clearly moved towards our new open_files_limit of 150'000 (with table_open_cache and table_definition_cache set to 32k each) and then we got Too many open files errors. So we set table_open_cache and table_definition_cache both back to 2048 and go a stable number of file descriptors between 100'000 and 110'000.

This gives us an idea how to extrapolate those values when we want to have bigger caches. But we do not know how a Linux system or MySQL will behave with much higher values...

Partition table test

But why do we get such high values?

As an example we took the following table with 14 partitions:

CREATE TABLE ptn_test (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT
, data VARCHAR(64)
, ts TIMESTAMP
, PRIMARY KEY (id, ts)
, INDEX (ts)
) ENGINE = MyISAM
PARTITION BY RANGE ( UNIX_TIMESTAMP(ts) ) (
  PARTITION p_2010    VALUES LESS THAN ( UNIX_TIMESTAMP('2011-01-01 00:00:00') )
, PARTITION p_2011_01 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-02-01 00:00:00') )
, PARTITION p_2011_02 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-03-01 00:00:00') )
, PARTITION p_2011_03 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-04-01 00:00:00') )
, PARTITION p_2011_04 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-05-01 00:00:00') )
, PARTITION p_2011_05 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-06-01 00:00:00') )
, PARTITION p_2011_06 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-07-01 00:00:00') )
, PARTITION p_2011_07 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-08-01 00:00:00') )
, PARTITION p_2011_08 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-09-01 00:00:00') )
, PARTITION p_2011_09 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-10-01 00:00:00') )
, PARTITION p_2011_10 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-11-01 00:00:00') )
, PARTITION p_2011_11 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-12-01 00:00:00') )
, PARTITION p_2011_12 VALUES LESS THAN ( UNIX_TIMESTAMP('2012-01-01 00:00:00') )
, PARTITION p_max VALUES LESS THAN (MAXVALUE)
);

and inserted some rows in each partition:

INSERT INTO ptn_test VALUES
  (NULL, 'Bla', '2010-12-01 00:00:42')
, (NULL, 'Bla', '2011-01-01 00:00:42')
, (NULL, 'Bla', '2011-02-01 00:00:42')
, (NULL, 'Bla', '2011-03-01 00:00:42')
, (NULL, 'Bla', '2011-04-01 00:00:42')
, (NULL, 'Bla', '2011-05-01 00:00:42')
, (NULL, 'Bla', '2011-06-01 00:00:42')
, (NULL, 'Bla', '2011-07-01 00:00:42')
, (NULL, 'Bla', '2011-08-01 00:00:42')
, (NULL, 'Bla', '2011-09-01 00:00:42')
, (NULL, 'Bla', '2011-10-01 00:00:42')
, (NULL, 'Bla', '2011-11-01 00:00:42')
, (NULL, 'Bla', '2011-12-01 00:00:42')
, (NULL, 'Bla', '2012-01-01 00:00:42');

Just for running a simple EXPLAIN:

EXPLAIN PARTITIONS SELECT * FROM ptn_test WHERE ts = '2011-11-01 00:00:42';

+----+-------------+----------+------------+--------+---------------+------+---------+------+------+-------+
| id | select_type | table    | partitions | type   | possible_keys | key  | key_len | ref  | rows | Extra |
+----+-------------+----------+------------+--------+---------------+------+---------+------+------+-------+
|  1 | SIMPLE      | ptn_test | p_2011_11  | system | ts            | NULL | NULL    | NULL |    1 |       |
+----+-------------+----------+------------+--------+---------------+------+---------+------+------+-------+

MySQL already opens 28 file descriptors and uses one table_open_cache and one table_definition_cache entry. When doing the same query in a second session on a second schema MySQL already has opened 56 file descriptors and 2 table_open_cache and 2 table_definition_cache entries. What the MySQL manual states about you can find here: How MySQL Opens and Closes Tables.

SHOW GLOBAL STATUS LIKE 'open_%s';

+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 56    |
| Open_table_definitions   | 2     |
| Open_tables              | 2     |
+--------------------------+-------+

By the way we found that the value of Open_files is pretty close to the result of the following Linux command:

lsof -p <pid> | grep '/var/lib/mysql' | grep -e 'MYD' -e 'MYI' -e 'ib' | wc -l

So Open_files is a good indication of how far we are away of open_files_limit. MySQL needs some more file descriptors for error log file, binary log file etc. But this is typically less than 10.

Now we have already found why we got the Too many open files error message.

But it does not explain yet why our MyISAM key buffer was so badly used.

MyISAM key buffer is wiped out

I have not found it mentioned in the documentation explicitly (please correct me when I am wrong) but it looks like when MySQL has a pressure on the table_open_cache and removes entries from it, it wipes out blocks from the MyISAM key buffer of those tables as well.

A little experiment for this:

  • Clean up table_open_cache and table_definition_cache and MyISAM key buffer:
FLUSH TABLES;
FLUSH STATUS;
SET GLOBAL key_buffer_size=1024*1024;
SHOW GLOBAL STATUS LIKE 'open_%s';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 0     |
| Open_table_definitions   | 0     |
| Open_tables              | 0     |
| Opened_files             | 0     |
| Opened_table_definitions | 0     |
| Opened_tables            | 0     |
+--------------------------+-------+

SHOW GLOBAL STATUS LIKE 'key_blocks_un%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Key_blocks_unused | 837   |
+-------------------+-------+

  • Run a query from 2 sessions in 2 different schemata:
EXPLAIN
SELECT COUNT(*) FROM (
  SELECT * FROM ptn_test
   WHERE ts >= '2011-11-01 00:00:00' and Ts < '2011-12-01 00:00:00'
  ) AS x;

  • Show the status information again:
SHOW GLOBAL STATUS LIKE 'open_%s';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 56    |
| Open_table_definitions   | 2     |
| Open_tables              | 2     |
| Opened_files             | 62    |
| Opened_table_definitions | 2     |
| Opened_tables            | 2     |
+--------------------------+-------+

SHOW GLOBAL STATUS LIKE 'key_blocks_un%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Key_blocks_unused | 504   |
+-------------------+-------+

We have 56 file descriptors (2 tables x 14 partitions x (data + index = 2)), 2 table_open_cache and table_definition_cache entries and 333 MyISAM key buffer blocks used.

  • Connect from a 3rd connection without the -A option:
SHOW GLOBAL STATUS LIKE 'open_%s';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 30    |
| Open_table_definitions   | 21    |
| Open_tables              | 16    |
| Opened_files             | 127   |
| Opened_table_definitions | 21    |
| Opened_tables            | 21    |
+--------------------------+-------+

SHOW GLOBAL STATUS LIKE 'key_blocks_un%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Key_blocks_unused | 669   |
+-------------------+-------+

26 file descriptors where released. We currently have 16 entries in the table_open_cache (its size was limited to 16) but in total 19 tables were opened. Why 65 files were opened for this operation is unclear.

The current amount of MyISAM key blocks is 168 which is roughly the half of the value before. So we can assume that one of our 2 partitioned table was closed and its key buffer entries were wiped out.

  • Run the query again
EXPLAIN
SELECT COUNT(*) FROM (
  SELECT * FROM ptn_test
   WHERE ts >= '2011-11-01 00:00:00' and Ts < '2011-12-01 00:00:00'
  ) AS x;
SHOW GLOBAL STATUS LIKE 'open_%s';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 58    |
| Open_table_definitions   | 21    |
| Open_tables              | 16    |
| Opened_files             | 156   |
| Opened_table_definitions | 21    |
| Opened_tables            | 22    |
+--------------------------+-------+

SHOW GLOBAL STATUS LIKE 'key_blocks_un%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Key_blocks_unused | 504   |
+-------------------+-------+

28 file descriptors were used again with 29 in total and one more table was opened. And our key buffer is back to 333 blocks used.

So it looks like when we have really pressure on the table_open_cache this also has an impact on the MyISAM key buffer.

How does it behave with InnoDB?

Nowadays InnoDB is much more often used than MyISAM. So let us have a look how is the impact on InnoDB tables:

  • Starting point values:
SHOW GLOBAL STATUS LIKE 'open_%s';
SHOW GLOBAL STATUS LIKE 'innodb_buffer_pool_pages_%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 0     |
| Open_table_definitions   | 0     |
| Open_tables              | 0     |
| Opened_files             | 0     |
| Opened_table_definitions | 0     |
| Opened_tables            | 0     |
+--------------------------+-------+

+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_buffer_pool_pages_data    | 1012  |
| Innodb_buffer_pool_pages_free    | 1002  |
| Innodb_buffer_pool_pages_misc    | 34    |
| Innodb_buffer_pool_pages_total   | 2048  |
+----------------------------------+-------+

  • Run the query in 2 different connections on 2 different tables:
SHOW GLOBAL STATUS LIKE 'open_%s';
SHOW GLOBAL STATUS LIKE 'innodb_buffer_pool_pages_%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 0     |
| Open_table_definitions   | 2     |
| Open_tables              | 2     |
| Opened_files             | 6     |
| Opened_table_definitions | 2     |
| Opened_tables            | 2     |
+--------------------------+-------+

+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_buffer_pool_pages_data    | 1012  |
| Innodb_buffer_pool_pages_free    | 1002  |
| Innodb_buffer_pool_pages_misc    | 34    |
| Innodb_buffer_pool_pages_total   | 2048  |
+----------------------------------+-------+

  • Use a new connection:
SHOW GLOBAL STATUS LIKE 'open_%s';
SHOW GLOBAL STATUS LIKE 'innodb_buffer_pool_pages_%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 2     |
| Open_table_definitions   | 21    |
| Open_tables              | 16    |
| Opened_files             | 71    |
| Opened_table_definitions | 21    |
| Opened_tables            | 21    |
+--------------------------+-------+

+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_buffer_pool_pages_data    | 1012  |
| Innodb_buffer_pool_pages_free    | 1002  |
| Innodb_buffer_pool_pages_misc    | 34    |
| Innodb_buffer_pool_pages_total   | 2048  |
+----------------------------------+-------+

So the phenomena of wiping out data seems definitely not to happen with InnoDB tables. Further InnoDB seems to use much less file descriptors than MyISAM.

Conclusion

  • MyISAM uses a huge number of file descriptors. This comes especially true when using partitions.
  • This needs a significant increase of open_files_limit. Its impact is unknown.
  • Shortage of entries in the table_open_cache leads to wipe out tables from the table_open_cache.
  • This leads to wipe out of MyISAM key buffer blocks of the according tables which leads to slower queries when the table is requested next time.
  • With InnoDB this behavior is much more relaxed and problems should appear much later than with MyISAM.
  • Be careful using a significant number of (partitioned) tables with MyISAM. It can have a serious impact on the performance of your system.

Comments

Did you test with or without innodb_file_per table? Did you know about MySQL Bug #62224?
Daniël van Eedencomment

Since MySQL 5.1.43 timestamp is not allowed as partition key any more [ 1 ]. Thus we can do it as follows:

CREATE TABLE ptn_test (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT
, data VARCHAR(64)
, dt DATETIME
, PRIMARY KEY (id, dt)
, INDEX (dt)
) ENGINE = MyISAM
PARTITION BY RANGE ( TO_DAYS(dt) ) (
  PARTITION p_2010    VALUES LESS THAN ( TO_DAYS('2011-01-01 00:00:00') )
, PARTITION p_2011_01 VALUES LESS THAN ( TO_DAYS('2011-02-01 00:00:00') )
, PARTITION p_2011_02 VALUES LESS THAN ( TO_DAYS('2011-03-01 00:00:00') )
, PARTITION p_2011_03 VALUES LESS THAN ( TO_DAYS('2011-04-01 00:00:00') )
, PARTITION p_2011_04 VALUES LESS THAN ( TO_DAYS('2011-05-01 00:00:00') )
, PARTITION p_2011_05 VALUES LESS THAN ( TO_DAYS('2011-06-01 00:00:00') )
, PARTITION p_2011_06 VALUES LESS THAN ( TO_DAYS('2011-07-01 00:00:00') )
, PARTITION p_2011_07 VALUES LESS THAN ( TO_DAYS('2011-08-01 00:00:00') )
, PARTITION p_2011_08 VALUES LESS THAN ( TO_DAYS('2011-09-01 00:00:00') )
, PARTITION p_2011_09 VALUES LESS THAN ( TO_DAYS('2011-10-01 00:00:00') )
, PARTITION p_2011_10 VALUES LESS THAN ( TO_DAYS('2011-11-01 00:00:00') )
, PARTITION p_2011_11 VALUES LESS THAN ( TO_DAYS('2011-12-01 00:00:00') )
, PARTITION p_2011_12 VALUES LESS THAN ( TO_DAYS('2012-01-01 00:00:00') )
, PARTITION p_max VALUES LESS THAN (MAXVALUE)
);
olicomment