Adding Columns to large MySQL Tables Quickly
Suppose that you have a MySQL Database, and in that database you have a non-trivial table with more than a million records. If you’re using that table with your Rails application, you might at some point like to add some additional columns.
It’s tempting to just write the migration like:
class AddThreeColumnsToQuarks < ActiveRecord::Migration def self.up add_column :quarks, :arbitrary_field1, :integer add_column :quarks, :arbitrary_field2, :string add_column :quarks, :arbitrary_field3, :integer end def self.down remove_column :quarks, :arbitrary_field1 remove_column :quarks, :arbitrary_field2 remove_column :quarks, :arbitrary_field3 end end
Should you do that, you will find that although it works, MySQL will take a fantastic amount of time to add the column when you have a lot of rows. What ActiveRecord is doing is adding each column individually with an alter statement:
ALTER TABLE `quarks` ADD `arbitrary_field1` int(11) ALTER TABLE `quarks` ADD `arbitrary_field2` varchar(255) ALTER TABLE `quarks` ADD `arbitrary_field3` int(11)
Each one of those ALTER statements makes a new temporary table, copies records from your existing table into the new table, and then replaces the old table with the new table. Five thousand records in the database? Adding three columns will copy the DB three times. Fifteen thousand rows are copied.
One can make this better by combining the ALTERs into one statement (as long as the ALTER contains a single type of operation, such as ADD COLUMN). The copy of the data in the table still takes a while. A few million rows? You might be waiting tens of minutes.
A FASTER way of adding columns is to create your own new table, then SELECT all of the rows from the existing table into it. You can create the structure from the existing table, then modify the structure however you’d like, then SELECT in the data. MAKE SURE that you SELECT the information into the new table in the same order as the fields are defined. Here’s an example:
class AddThreeColumnsToQuarks < ActiveRecord::Migration def self.up sql = ActiveRecord::Base.connection() sql.execute "SET autocommit=0" sql.begin_db_transaction sql.execute("CREATE TABLE quarks_new LIKE quarks") add_column :quarks_new, :arbitrary_field1, :integer add_column :quarks_new, :arbitrary_field2, :string add_column :quarks_new, :arbitrary_field3, :integer sql.execute("INSERT INTO quarks_new SELECT *, NULL, NULL, NULL FROM quarks") rename_table :quarks, :quarks_old rename_table :quarks_new, :quarks sql.commit_db_transaction # don't forget to remove quarks_old someday end def self.down drop_table :quarks rename_table :quarks_old, :quarks end end
You can change the NULLs into whatever default values you’d like the new columns in the existing rows to have.
How much faster can this be? On one table in one of our databases, a single add_column was approximately 17 minutes. When we used this technique, we reduced the time to add a column to approximately 45 seconds. YMMV —however you’ll notice a big improvement.
What about the indices? The CREATE TABLE .. LIKE preserves column attributes and indices. For more information, see the MySQL on line manual.