You are here

Shrinking InnoDB system tablespace file ibdata1 PoC

In this weeks MySQL workshop we were discussing, beside other things, about the innodb_file_per_table parameter and its advantages of enabling it. In addition there was a discussion if the InnoDB system tablespace file can be shrinked once it has been grown very large or not. We all know the answer: The InnoDB system tablespace file does never shrink again.

But why should it not be possible? Other databases like for example Oracle can shrink or even get rid of tablespace files... After some philosophising about it we came to the conclusion that we should give it a try if this is possible with InnoDB as well.

The scenario we considered was the following: You inherit a MySQL database with InnoDB tables but innodb_file_per_table was set to 0. So all the tables are located in the InnoDB tablespace file. And only a small amount of space is left on the device and there is a lot of free space in the InnoDB system tablespace file. The database itself is much too big to dump and restore and we want to get rid of the one big InnoDB system tablespace file and have many small tablespace files as we get them with innodb_file_per_table = 1.

So what we did is the following: We created InnoDB tables inside the InnoDB system tablespace (ibdata1) and bloat them up. Then we altered them to be placed in their own tablespace files by OPTIMIZE TABLE. And now the tricky part starts: How can we shrink the InnoDB system tablespace file to free the disk space again?

CAUTION: This is a prove of concept and should never be used on a production system!!!

First we move all tables out of the InnoDB system tablespace (with innodb_file_per_table = 1):

mysqlcheck --optimize --all-databases --user=root
...
note     : Table does not support optimize, doing recreate + analyze instead
status   : OK
...

Now all tables have been moved out of the system tablespace, but the file is still about 674 Mbyte in size:

ll ibdata1
-rw-rw----. 1 mysql mysql 706740224 Dec  6 23:37 ibdata1

Then we search for empty blocks at the end of the InnoDB data files:

innochecksum -v -d ibdata1

file ibdata1 = 706740224 bytes (43136 pages)...
checking pages in range 0 to 43135
page 0: log sequence number: first = 3558400819; second = 3558400819
page 0: old style: calculated = 148443420; recorded = 148443420
page 0: new style: calculated = 4252778336; recorded = 4252778336
...
page 42508: log sequence number: first = 0; second = 0
page 42508: old style: calculated = 1371122432; recorded = 0
page 42508: new style: calculated = 1575996416; recorded = 0
...
page 43135: log sequence number: first = 0; second = 0
page 43135: old style: calculated = 1371122432; recorded = 0
page 43135: new style: calculated = 1575996416; recorded = 0

In ideal case we should also find blocks which are not used any more but not blanked out. Theses 627 blocks (of 16k = 10 Mbyte) can easily be removed...

Next we shrink the InnoDB system tablespace file after stopping the mysqld:

printf '' | dd of=ibdata1 bs=16384 seek=42508
ll ibdata1
-rw-rw----. 1 mysql mysql 696451072 Dec  6 23:42 ibdata1

As a next step we have to change the number of blocks in the header of the InnoDB system tablespace file. This can be done with a tool like hexedit (aptitude install hexedit). We have to change at position 0x0030 the value from 43136 (0xA880) to 42508 (0xA60C):

hexdump -C -n 256 ibdata1
00000000  fd 7c 3f 60 00 00 00 00  00 00 00 00 00 00 00 00  |.|?`............|
00000010  00 00 00 00 d4 18 e3 33  00 08 00 00 00 00 d4 18  |.......3........|
00000020  e4 13 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  a8 80 00 00 a6 c0 00 00  00 00 00 00 01 21 00 00  |.............!..|

Otherwise we would get an error like:

InnoDB: Error: tablespace size stored in header is 43146 pages, but
InnoDB: the sum of data file sizes is only 42508 pages

It looks like InnoDB itself corrects somehow the block number to a 0x100 boundary (4 Mbyte) later.

As the next step we have to fix the new style check sum (at position 0x0000) and the old style check sum (at position 0x3FFC). You have to do this until innochecksum does not complain anymore:

innochecksum -d -p 0 ibdata1
file ibdata1 = 696451072 bytes (42508 pages)...
checking pages in range 0 to 0
page 0: log sequence number: first = 3558400819; second = 3558400819
page 0: old style: calculated = 2354503790; recorded = 2354503790
page 0: new style: calculated = 3427457314; recorded = 3587772574

When you have done this the database should be ready to start.

The tables later on can be possibly transferred with the transportable tablespace feature which comes with MySQL 5.6.

I have not found a good way yet to find the highest used block in the tablespace file. So it is a wild guess which is dangerous. Especially because some InnoDB UNDO LOG blocks seems to be located there at very high positions:

SELECT page_type, MAX(page_number) AS max_page_number
  FROM information_schema.innodb_buffer_page
 WHERE space = 0
   AND page_number != 0
 GROUP BY page_type
 ORDER BY max_page_number;

+-------------------+-----------------+
| page_type         | max_page_number |
+-------------------+-----------------+
| TRX_SYSTEM        |               5 |
| SYSTEM            |             300 |
| BLOB              |            9366 |
| EXTENT_DESCRIPTOR |           32768 |
| IBUF_BITMAP       |           32769 |
| INODE             |           42123 |
| INDEX             |           45229 |
| ALLOCATED         |           45247 |
| UNDO_LOG          |           45503 |
+-------------------+-----------------+

It would be good if we have a method to relocate those blocks somehow...

To verify that everything works I have tried to increase the system tablespace again. This seems to work if the number of blocks is dividable by 256 (4 Mbyte, or 128 2 Mbyte?). But growing the system tablespace again should not be the intention.

Further according to our tests this method of shrinking the InnoDB system tablespace seems to work with MySQL 5.1, 5.5 and 5.6.

Thanks to Ralf, Torsten and Stefan for assistance!

It would be nice to get some feedback from the InnoDB and Percona guys about how this feature could be implemented correctly...

And finally: Do not blame and beat me. I know that this is an evil hack, but I like to play in my sandbox as I want!

Taxonomy upgrade extras: 

Comments

The I_S.INNODB_BUFFER_PAGE query seems unsafe to me, as it queries the buffer pool and not the tablespace. The innochecksum approach to find unused pages at the end to truncate + header field patching seems to be as OK as it could possibly be given the method. It seems that implementing the truncate of unused tablespace trailer would not be that hard (say, in XtraDB), and compacting with page relocation should be possible too with more effort. I wonder if it's on InnoDB roadmap.
Laurynascomment

Hi, Don't work for me :( file ibdata1 = 144539648 bytes (8822 pages)... checking pages in range 0 to 8821 page 0: log sequence number: first = 579975309; second = 579975309 page 0: old style: calculated = 1043303352; recorded = 1043303352 page 0: new style: calculated = 2013654343; recorded = 3190797841 page 0 invalid (fails new style checksum) Question: The hex value A60C: A6 in 0x0030 and 0C in 0x0031 ? Thanks,
csmaniotocomment

Hello csmanioto,

As mentioned above this is a PoC and should never be used on system with critical data because it is an evil hack!!!

If you would like to see this comming in production soon you should vote fore this feature request.

Shinguzcomment