You are here

Does InnoDB data compression help with short disk space?

Taxonomy upgrade extras: 

Because we are a bit short off disk space on one of our servers I had the idea to try out the MySQL feature Data Compression for InnoDB. This feature is useful if you have tables with VARCHAR, BLOB or TEXT attributes.

To not make it not too simple our table is partitioned as well. Our table looks like this:

CREATE TABLE `history_str` (
  `itemid` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `clock` int(11) unsigned NOT NULL DEFAULT '0',
  `value` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`itemid`,`clock`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY RANGE (clock)
(PARTITION p2012_kw05 VALUES LESS THAN (1328482800) ENGINE = InnoDB,
 PARTITION p2012_kw06 VALUES LESS THAN (1329087600) ENGINE = InnoDB,
 PARTITION p2012_kw07 VALUES LESS THAN (1329692400) ENGINE = InnoDB,
 PARTITION p2012_kw08 VALUES LESS THAN (1330297200) ENGINE = InnoDB,
 PARTITION p2012_kw09 VALUES LESS THAN (1330902000) ENGINE = InnoDB,
 PARTITION p2012_kw10 VALUES LESS THAN (1331506800) ENGINE = InnoDB,
 PARTITION p2012_kw11 VALUES LESS THAN (1332111600) ENGINE = InnoDB,
 PARTITION p2012_kw12 VALUES LESS THAN MAXVALUE ENGINE = InnoDB);

And the partitions use the following space on disk (not really much, I know!):

-rw-rw---- 1 mysql mysql 184549376 Mar  7 00:43 history_str#P#p2012_kw05.ibd
-rw-rw---- 1 mysql mysql 209715200 Mar 14 00:11 history_str#P#p2012_kw06.ibd
-rw-rw---- 1 mysql mysql 234881024 Mar 21 00:47 history_str#P#p2012_kw07.ibd
-rw-rw---- 1 mysql mysql 226492416 Mar 23 16:39 history_str#P#p2012_kw08.ibd
-rw-rw---- 1 mysql mysql 234881024 Mar 19 18:22 history_str#P#p2012_kw09.ibd
-rw-rw---- 1 mysql mysql 289406976 Mar 19 18:22 history_str#P#p2012_kw10.ibd
-rw-rw---- 1 mysql mysql 281018368 Mar 23 16:39 history_str#P#p2012_kw11.ibd
-rw-rw---- 1 mysql mysql 213909504 Mar 23 17:23 history_str#P#p2012_kw12.ibd

After the table was compressed with the following values:

ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4

The space on disk was used as follows:

-rw-rw---- 1 mysql mysql   7340032 Mar 23 17:33 history_str#P#p2012_kw05.ibd
-rw-rw---- 1 mysql mysql   7340032 Mar 23 17:34 history_str#P#p2012_kw06.ibd
-rw-rw---- 1 mysql mysql   8388608 Mar 23 17:36 history_str#P#p2012_kw07.ibd
-rw-rw---- 1 mysql mysql  75497472 Mar 23 17:49 history_str#P#p2012_kw08.ibd
-rw-rw---- 1 mysql mysql 104857600 Mar 23 17:44 history_str#P#p2012_kw09.ibd
-rw-rw---- 1 mysql mysql 125829120 Mar 23 17:51 history_str#P#p2012_kw10.ibd
-rw-rw---- 1 mysql mysql 125829120 Mar 23 17:57 history_str#P#p2012_kw11.ibd
-rw-rw---- 1 mysql mysql 134217728 Mar 23 18:11 history_str#P#p2012_kw12.ibd

So we got a reduction of used disk space by 40 - 60%! Not too bad.

But we also want to see what impact it has on memory:

SHOW GLOBAL STATUS LIKE 'innodb_buffer%';
+---------------------------------------+----------------------+
| Variable_name                         | Value                |
+---------------------------------------+----------------------+
| Innodb_buffer_pool_pages_data         | 10769                |
| Innodb_buffer_pool_pages_dirty        | 6613                 |
| Innodb_buffer_pool_pages_free         | 644                  |
| Innodb_buffer_pool_pages_misc         | 18446744073709549802 |
| Innodb_buffer_pool_pages_total        | 9599                 |
+---------------------------------------+----------------------+

Those numbers do not really make sense. We also hit a known MySQL Bug #59550: Innodb_buffer_pool_pages_misc goes wrong.

Some fancy graphs

InnoDB Buffer Pool activity

compressed_tables1.png

Because our InnoDB Buffer pool was too big we have reduced it a bit. For enabling the Barracuda file format we restarted the database afterwards. An then the numbers went amok...

InnoDB compression time

compressed_tables2.png

The first time we can see InnoDB compression time in our Monitor... \o/

InnoDB Row operations

compressed_tables3.png

And here you can find out how we solved it technically... :-)

Comments

Hi, could you explain how InnoDB compression time is calculated ? Thx Cédric
cpeintrecomment

Hello Cédric, What the monitor does is the following:
SELECT SUM(compress_time) AS compress_time, SUM(uncompress_time) AS uncompress_time
  FROM information_schema.INNODB_CMP
The MySQL documentation states in chapter The INFORMATION_SCHEMA INNODB_CMP and INNODB_CMP_RESET Tables: COMPRESS_TIME: Total time in seconds spent in attempts to compress B-tree pages of the size PAGE_SIZE.
UNCOMPRESS_TIME: Total time in seconds spent in uncompressing B-tree pages of the size PAGE_SIZE.
So we store the delta and can see the seconds spent in compressing or uncompressing pages. And I see that these units do not make much sense here. I will fix this. If you have suggestions how to monitor these information in a better way we are happy to hear your opinion. Oli
olicomment

Hello, thx for these informations. I regret that these tables are not available in 5.1 with InnoDB plugin.
cpeintrecomment

Hello Cédric, OK. I see. 5.1 is a bit outdated now... Most of our customers are going to 5.5 now or are already on it. So we have to focus on the future. Oli
olicomment

I understand that you have to focus on the future, no problem. My last comment wasn't for you. But I'm testing compression in a 5.1 (with InnoDB plugin) now and these tables could help me. I will test that on a 5.5 very soon. Thx Cédric
cpeintrecomment