Rails Data Migration Task

24 Apr 2020

Rails database migration annoyance

Sometimes we want to update data as part of a database migration:

class UpdateUserData < ActiveRecord::Migration[6.0]
  def up
    User.find_each do |user|
      user.update(some_attr: 'attr content')
    end
  end

  def down
  end
end

But this approach has drawbacks:

  • Changes in the codebase may cause the database migration to fail when you rerun it from scratch
  • Sometimes we just want to get rid of the data update code
  • Sometimes the update logic is complicated enough that we want to write tests for it

I found a solution that fixes these problems and still runs the data updates as part of the database migration flow.

In brief: put that code outside the database migration files.

Let’s do it

Create data_migration folder

mkdir lib/data_migration

Create four files: lib/data_migration.rb, plus base.rb, empty_migrate.rb, and get_task_class.rb inside lib/data_migration

# lib/data_migration.rb
class DataMigration
  class << self
    def execute(migration_filename)
      new(migration_filename).execute
    end
  end

  def initialize(migration_filename)
    @migration_filename = migration_filename
  end

  def execute
    task_class.execute(migration_filename)
  end

  private

  attr_reader :migration_filename

  def task_class
    GetTaskClass.call(migration_filename)
  end
end

# lib/data_migration/get_task_class.rb
class DataMigration
  module GetTaskClass
    def call(migration_filename)
      "DataMigration::DM#{migration_filename.camelize}".constantize
    rescue NameError
      EmptyMigrate
    end

    module_function :call
  end
end
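
To make the lookup concrete, here is a rough plain-Ruby sketch of how a migration filename maps to a task class and how a missing class falls back to EmptyMigrate. It is only an approximation: the camelize helper below is a hand-written stand-in for ActiveSupport's String#camelize, const_get replaces constantize, and the two empty classes are placeholders so the snippet runs outside Rails.

```ruby
class DataMigration
  # Placeholder classes so the lookup has something to find (assumptions).
  class EmptyMigrate; end
  class DM20200425104872UpdateUserData; end

  module GetTaskClass
    module_function

    # Hand-written approximation of ActiveSupport's String#camelize
    # for snake_case names such as migration filenames.
    def camelize(name)
      name.split('_').map(&:capitalize).join
    end

    # Look the task class up by name; fall back when it does not exist.
    def call(migration_filename)
      DataMigration.const_get("DM#{camelize(migration_filename)}", false)
    rescue NameError
      EmptyMigrate
    end
  end
end

found = DataMigration::GetTaskClass.call('20200425104872_update_user_data')
missing = DataMigration::GetTaskClass.call('20200101000000_deleted_task')
puts found   # DataMigration::DM20200425104872UpdateUserData
puts missing # DataMigration::EmptyMigrate
```

The second call uses a made-up filename with no matching class, which is the situation after a task file has been deleted.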

# lib/data_migration/base.rb
class DataMigration
  class Base

    class << self
      def execute(migration_filename)
        task = new(migration_filename)
        task.execute
        task.report
      end
    end

    attr_reader :migration_filename

    def initialize(migration_filename)
      @migration_filename = migration_filename
    end

    def execute
      raise NotImplementedError, "#{self.class}#execute must be overridden"
    end

    def report
      Rails.logger.info("Finish a data migration task for #{migration_filename}")
    end
  end
end

# lib/data_migration/empty_migrate.rb
class DataMigration
  class EmptyMigrate < Base
    def execute
      # Do nothing
    end

    def report
      Rails.logger.info("Don't have a data migration task for #{migration_filename}")
    end
  end
end
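
The flow through Base and EmptyMigrate can be tried outside Rails with a small sketch like the one below. It assumes a standard-library Logger writing to a StringIO as a stand-in for Rails.logger (the LOGGER and LOG_OUTPUT names are made up for this example), and it uses NotImplementedError for the "must override" case.

```ruby
require 'logger'
require 'stringio'

# Stand-ins for Rails.logger so the sketch runs outside Rails (assumptions).
LOG_OUTPUT = StringIO.new
LOGGER = Logger.new(LOG_OUTPUT)

class DataMigration
  class Base
    attr_reader :migration_filename

    # Build the task, run it, then log the outcome.
    def self.execute(migration_filename)
      task = new(migration_filename)
      task.execute
      task.report
    end

    def initialize(migration_filename)
      @migration_filename = migration_filename
    end

    # Subclasses are expected to provide the actual data update.
    def execute
      raise NotImplementedError, "#{self.class}#execute must be overridden"
    end

    def report
      LOGGER.info("Finish a data migration task for #{migration_filename}")
    end
  end

  # Fallback task: does nothing and logs that no task was found.
  class EmptyMigrate < Base
    def execute
      # Do nothing
    end

    def report
      LOGGER.info("Don't have a data migration task for #{migration_filename}")
    end
  end
end

DataMigration::EmptyMigrate.execute('20200425104872_update_user_data')
puts LOG_OUTPUT.string
```

Calling DataMigration::Base.execute directly raises NotImplementedError, which makes a forgotten override fail loudly instead of silently doing nothing.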

Now we have built the basic data migration structure. Suppose we run rails g migration update_user_data and get a file like this:

# db/migrate/20200425104872_update_user_data.rb
class UpdateUserData < ActiveRecord::Migration[6.0]
  def change
  end
end

Then we update it:

# db/migrate/20200425104872_update_user_data.rb
class UpdateUserData < ActiveRecord::Migration[6.0]
  def up
    DataMigration.execute('20200425104872_update_user_data')
  end

  def down
  end
end

And create a new file called dm20200425104872_update_user_data.rb:

# lib/data_migration/dm20200425104872_update_user_data.rb
class DataMigration
  class DM20200425104872UpdateUserData < Base
    def execute
      User.find_each do |user|
        user.update(some_attr: 'attr content')
      end
    end
  end
end

Then we can run data migration with bin/rails db:migrate. That’s it.

You may ask what we gain from this. There are two advantages:

  • You can easily write unit tests for DataMigration::DM20200425104872UpdateUserData.
  • You can delete this file whenever you want. DataMigration falls back to DataMigration::EmptyMigrate when it can’t find the target task.
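
As an illustration of the first point, here is a minimal sketch of the check a unit test could make. It runs outside Rails, so the User class is a hypothetical in-memory stand-in for the ActiveRecord model and Base is trimmed to the constructor; in a real application the same assertion would sit in a Minitest or RSpec test case against the real model.

```ruby
# Hypothetical in-memory stand-in for the ActiveRecord model (assumption).
class User
  attr_reader :some_attr

  class << self
    def records
      @records ||= []
    end

    # Mimics the block form of ActiveRecord's find_each.
    def find_each(&block)
      records.each(&block)
    end
  end

  # Mimics ActiveRecord's update for the single attribute used here.
  def update(some_attr:)
    @some_attr = some_attr
    true
  end
end

class DataMigration
  # Trimmed copy of Base: only the constructor the task relies on.
  class Base
    attr_reader :migration_filename

    def initialize(migration_filename)
      @migration_filename = migration_filename
    end
  end

  class DM20200425104872UpdateUserData < Base
    def execute
      User.find_each do |user|
        user.update(some_attr: 'attr content')
      end
    end
  end
end

# The check a unit test would make: every user is updated after execute.
User.records.replace([User.new, User.new])
task = DataMigration::DM20200425104872UpdateUserData.new('20200425104872_update_user_data')
task.execute
all_updated = User.records.all? { |user| user.some_attr == 'attr content' }
raise 'expected every user to be updated' unless all_updated
```

Because the task is an ordinary class, it can be instantiated and executed directly without going through db:migrate.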