How we Altered a MySQL Table with 50 Million Rows without Downtime using gh-ost

I am sure if you’ve googled your way to this article, you already know the problem but here you go..

The Problem

Our production database has some key tables. Business requirements have evolved and recently we needed to add a new column to a table with 50 million rows. We played around with other options, but since it was likely to become a column with many WHERE queries, we decided that we had to add the column.

Browntape is an eCommerce web app and that is a 24×7 business, so we can’t take much downtime. Few minutes is acceptable, but nothing in hours. Doing an ALTER will lock the table for a few hours – not acceptable.

On top of that, we are on Amazon RDS. That causes some more complications like no access to super user. MySql version 5.6 (don’t judge).

gh-ost and other options

One option was percona-online-schema-change which is pretty popular to do this kind of stuff. It basically makes a ghost table (with new schema), points a bunch of triggers from the original table to the ghost table. Once the replication is done, you are set.

Another option is an online schema change by Facebook which seems to be popular.

However, we used gh-ost for the following reasons:

Least learning curve – even though the documentation was not very beginner friendly, I was able to figure this out pretty quickly.
binlog not triggers – gh-ost uses the binlog which is the basis of all Mysql replication. That seemed a lot more robust way of achieving this.
Postpone cut over feature – when you are doing an alter on a key table on your production database, you are scared. You want control to check and double check everything. This feature lets you do that. More on that later.

Installing gh-ost on Ubuntu

Installing on Ubuntu was pretty straightforward. The github project has a link to a .deb file. Just grab the .deb file and install like this:

$wget https://github.com/github/gh-ost/releases/download/v1.1.0/gh-ost_1.1.0_amd64.deb
$sudo dpkg -i gh-ost_1.1.0_amd64.deb

Changes on RDS to be ready for gh-ost

For gh-ost to work on RDS, you have to enable binlogs and set the binlog format to rows.

These can be found in the parameter groups in RDS. You can check both of these conditions with the following commands.

show global variables like 'log_bin';
show global variables like 'binlog_format';

Dry Run of the Migration (noop)

gh-ost provides a way to do a mock dry run of the migration without actually running the migration. This is called noop. If you run the command without –execute, then it does not actually migrate the data but gives you any redflags.

Understanding the Important Switches in the gh-ost command

The gh-ost command comes with many switches. Most are self explanatory, but some need documentation. I found the documentation of the project sparse in this aspect, so adding this for the Internet.

Here’s what the command looks like as per the documentation:

gh-ost \
--user="gh-ost" \
--password="123456" \
--host=master.with.rbr.com \
--database="my_schema" \
--table="my_table" \
--verbose \
--alter="ADD COLUMN random int(11) null DEFAULT null" \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--throttle-control-replicas="myreplica.1.com,myreplica.2.com" \
--max-lag-millis=1500 \
--allow-on-master \
--switch-to-rbr \
--allow-master-master \
--cut-over=default \
--exact-rowcount \
--concurrent-rowcount \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
[--execute]

Let me not insult your intelligence by explaining the first 7 switches.

Here are explanations to some switches that I found useful:

--allow-on-master and --allow-master-master was required for running the migration without a replica. Our Mysql setup did not have replication and I did not want to introduce it. So, was to use it in a mode where the ghost table is formed in the master database itself. The first switch allows that and the second explicitly permits master to master replication.
--switch-to-rbr will not work on RDS as you do not have super privileges while connecting from a remote host. I had to change the binlog format to rows in the parameter groups and use the switch --assume-rbr.
--exact-row-count enforces that gh-ost should count the exact number of rows to be migrated (probably by SELECT count(*) FROM table) instead of estimating it using an EXPLAIN query.
--concurrent-rowcount is less understood, but probably means that the estimates presented by the tool while copying the rows will be more accurate.

Here are a few lines of the command’s output to give you an idea.

Copy: 32185000/49903876 64.5%; Applied: 0; Backlog: 0/1000; Time: 2h10m30s(total), 2h10m30s(copy); streamer: mysql-bin-changelog.700402:12100895; Lag: 0.02s, State: migrating; ETA: 1h11m50s

Copy: 32250000/49903876 64.6%; Applied: 0; Backlog: 0/1000; Time: 2h11m0s(total), 2h11m0s(copy); streamer: mysql-bin-changelog.700402:31779955; Lag: 0.02s, State: migrating; ETA: 1h11m42s

Copy: 32327000/49903876 64.8%; Applied: 0; Backlog: 0/1000; Time: 2h11m30s(total), 2h11m30s(copy); streamer: mysql-bin-changelog.700402:55073246; Lag: 0.02s, State: migrating; ETA: 1h11m29s

Copy: 32405000/49903876 64.9%; Applied: 0; Backlog: 0/1000; Time: 2h12m0s(total), 2h12m0s(copy); streamer: mysql-bin-changelog.700402:78400893; Lag: 0.01s, State: migrating; ETA: 1h11m16s

Copy: 32480000/49903876 65.1%; Applied: 0; Backlog: 0/1000; Time: 2h12m30s(total), 2h12m30s(copy); streamer: mysql-bin-changelog.700402:101370328; Lag: 0.02s, State: migrating; ETA: 1h11m4s

Copy: 32556000/49903876 65.2%; Applied: 0; Backlog: 0/1000; Time: 2h13m0s(total), 2h13m0s(copy); streamer: mysql-bin-changelog.700402:124312398; Lag: 0.02s, State: migrating; ETA: 1h10m52s

Being Extra Paranoid about my data

I understand that gh-ost is an established open source project with 8.8K stars. But, I am responsible for my customers’ data. I have to make sure things don’t go south. That brings me to my favourite switch in gh-ost.

--postpone-cut-over-flag-file allows you to control when the actual cut over to the ghost table happens.

Let me first explain the concept of “cut over”, by chronology:

gh-ost creates a ghost table of your original table without any data. This is called _original_table_gho
Applies the ALTER to this table (which is quick)
Then, uses the replication system native to MySql, ie. binlog, to populate this ghost table with all the records. It shows your the progress of that.
Once done, it can do the “cut over” where the ghost table is renamed with the name of the original table. The old table is now renamed _original_table_del which you can drop later.

Now, this step of cut over was very important for us.

We had to take a short down time during this, just to make sure that no new records cross with the cut over is happening.
We wanted to be super-sure of zero data loss at least of the most recent data.

So, the --postpone-cut-over-flag-file creates as file (as specified in the switch) when the gh-ost command is run. Now, once gh-ost copies all the data to the ghost table, it will show you the status in the output. You can then satisfy your paranoia about your data. When sure, just delete the file mentioned. gh-ost will run the cut over for you.

Zero Data Loss Check

Here’s how I quickly made sure during the downtime that all my 90 day data was safe in the ghost table. Might be a bit excessive, but I did a row by row check.

Basically, wrote a query which created an MD5 hash of the CONCAT of the few most important columns in my table. If the STRCMP of both these hashed was 0, then my data was safe. Here’s my query:

select o.id,
STRCMP( MD5(CONCAT(o.id,o.title, o.created, o.company_id, o.user_id, o.channel_id, o.warehouse_id, o.order_status)) ,
MD5(CONCAT(g.id,g.title, g.created, g.company_id, g.user_id, g.channel_id, g.warehouse_id, g.order_status)) ) as cmp
from original_table o, _original_table_gho g
WHERE o.id=g.id
AND o.created > '2021-01-01'
HAVING cmp=1
order by o.id desc limit 1000;

So that was our journey of migrating this large table.

It took 3.5 hours to migrate the table with about 50 million rows.

With only a 15 min downtime!

Explore Features of Browntape Software

How we Altered a MySQL Table with 50 Million Rows without Downtime using gh-ost

The Problem

gh-ost and other options

Installing gh-ost on Ubuntu

Changes on RDS to be ready for gh-ost

Dry Run of the Migration (noop)

Understanding the Important Switches in the gh-ost command

Being Extra Paranoid about my data

Zero Data Loss Check

Leave a Reply Cancel reply

COMPANY

SOLUTIONS

FEATURES

Chat With Us