
The "dirty secret" of golang-migrate

· 10 min read
Noa Rogoszinski
DevRel Engineer

Editor's note: Dear Reader, please accept my sincere apologies for the blatant dad joke in the title. Being a father of two, I couldn't resist the pun. -RT

Atlas was originally created to support Ent, a popular Go ORM. From the start, Ent shipped with a simple "auto-migration" feature that could set up the database schema based on the Ent schema. However, as the project grew popular, it became clear that a more robust, versioned migration system was needed.

Our team did not originally set out to build a new migration tool; Ent's authors had hoped to add functionality to generate migration files based on the existing "auto-migration" engine and use an off-the-shelf migration tool to apply them. The most promising candidate was golang-migrate, a widely adopted migration tool in the Go community which was renowned for its simplicity and wide database support.

But we came to realize that, like many tools that start simple and grow popular, golang-migrate begins to show its limitations as projects (and teams) scale. In this post, we'll review some of the issues we encountered with golang-migrate and how they ultimately led us to build Atlas.

The "dirty" state

When migrations are done properly, using golang-migrate is straightforward: write your SQL files, apply them in order, and you're done.

But it's unrealistic to expect that no mistakes will be made, and if something goes wrong mid-migration — suppose you made a simple typo — you will see an error like this:

error: migration failed in line 0: create table t1 (
id int primary key,
); (details: Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')' at line 3)

After fixing the error, you may intuitively rerun the migration, but now you're met with a new error:

error: Dirty database version 20250403065350. Fix and force version.

This "dirty" state means the last attempted migration didn't finish running and golang-migrate is now stuck. It won't apply future migrations until you manually resolve the issue and reset the version.
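To make the mechanics concrete, here is a minimal Go sketch of this bookkeeping. It is a toy model, not golang-migrate's source: the `tracker` type, its `apply` and `force` methods, and the `noVersion` constant are hypothetical names. The only assumption is the behavior described above: a version is flagged dirty when its migration fails, and nothing else runs until the flag is cleared by hand.

```go
package main

import (
	"errors"
	"fmt"
)

// noVersion is a hypothetical marker for "no migration applied yet".
const noVersion int64 = 0

var (
	errDirty  = errors.New("dirty database version: fix and force version")
	errSyntax = errors.New("syntax error in migration")
)

// tracker is a toy stand-in for the version bookkeeping of a migration tool.
type tracker struct {
	version int64
	dirty   bool
}

// apply refuses to run while the dirty flag is set. Otherwise it records the
// target version as dirty, runs the migration, and clears the flag only if
// the migration succeeded.
func (t *tracker) apply(version int64, run func() error) error {
	if t.dirty {
		return errDirty
	}
	t.version, t.dirty = version, true
	if err := run(); err != nil {
		return err // the dirty flag stays set
	}
	t.dirty = false
	return nil
}

// force overwrites the recorded version and clears the dirty flag.
// It does not touch the schema; cleaning that up is the operator's job.
func (t *tracker) force(version int64) {
	t.version, t.dirty = version, false
}

func main() {
	tr := &tracker{version: noVersion}

	// First attempt fails, so the version is left dirty.
	fmt.Println(tr.apply(20250403065350, func() error { return errSyntax }))
	// Retrying with a fixed migration is still rejected.
	fmt.Println(tr.apply(20250403065350, func() error { return nil }))
	// Only after forcing the previous version does the retry go through.
	tr.force(noVersion)
	fmt.Println(tr.apply(20250403065350, func() error { return nil }))
}
```

Running it prints the migration error, then the dirty error on the retry, and `<nil>` only once the version has been forced back.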

Isn't that what transactions are for?

If you've ever used a relational database, you might be asking yourself: why not just use transactions? Start executing the migration's statements and, if you hit an error, roll back the transaction!

With golang-migrate, however, migrations are not wrapped in transactions by default, even for databases that support transactions with DDL. Each file is applied independently. If one statement in the file fails, earlier statements remain applied and now the database is in an unknown limbo state.
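The difference is easy to see in a small simulation. The Go sketch below is an assumption-laden toy: the "database" is just an in-memory set of table names, and `applyPlain`, `applyTx`, `createTable`, and `failing` are hypothetical helpers, not part of any real tool. It only illustrates what a transaction would buy you on an engine that supports transactional DDL.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// db is a toy in-memory "database": a set of table names.
type db map[string]bool

// stmt models one migration statement as a function that mutates the db.
type stmt func(db) error

// createTable returns a statement that adds a table, failing if it exists.
func createTable(name string) stmt {
	return func(d db) error {
		if d[name] {
			return fmt.Errorf("table %q already exists", name)
		}
		d[name] = true
		return nil
	}
}

// failing returns a statement that always fails with the given message.
func failing(msg string) stmt {
	return func(db) error { return errors.New(msg) }
}

// applyPlain runs statements in order and stops at the first error.
// Whatever ran before the failure stays applied.
func applyPlain(d db, stmts []stmt) error {
	for _, s := range stmts {
		if err := s(d); err != nil {
			return err
		}
	}
	return nil
}

// applyTx emulates a transaction: it works on a copy of the db and
// replaces the original contents only if every statement succeeded.
func applyTx(d db, stmts []stmt) error {
	work := make(db, len(d))
	for k := range d {
		work[k] = true
	}
	if err := applyPlain(work, stmts); err != nil {
		return err // "rollback": the copy is simply discarded
	}
	for k := range d {
		delete(d, k)
	}
	for k := range work {
		d[k] = true
	}
	return nil
}

// tables lists the table names in sorted order.
func tables(d db) []string {
	names := make([]string, 0, len(d))
	for k := range d {
		names = append(names, k)
	}
	sort.Strings(names)
	return names
}

func main() {
	migration := []stmt{
		createTable("products"),
		createTable("orders"),
		failing("failed to add the foreign key constraint"),
	}

	plain := db{}
	fmt.Println(applyPlain(plain, migration), tables(plain))

	tx := db{}
	fmt.Println(applyTx(tx, migration), tables(tx))
}
```

In this model the plain run leaves both `products` and `orders` behind after the failure, while the transactional run leaves the store empty, ready for a clean retry.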

How to fix "Dirty database version"

So how does one reset their database to apply their fixed migration? Let's go over the process using an example.

Let's say you create a migration that adds two new tables and sets up foreign key constraints:

CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
description TEXT,
price DECIMAL(10, 2) NOT NULL,
stock INT NOT NULL DEFAULT 0
);

CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_fk VARCHAR(100) NOT NULL,
product_fk VARCHAR(255) NOT NULL
);

ALTER TABLE orders
ADD CONSTRAINT fk_orders_products
FOREIGN KEY (product_fk) REFERENCES products (name) ON DELETE CASCADE;

Trying to apply the migration:

migrate -source file://db/migrations -database "mysql://root:pass@tcp(localhost:3306)/example" up 1

An error occurs:

error: migration failed in line 0: CREATE TABLE products (
# ... redacted for brevity
(details: Error 6125: Failed to add the foreign key constraint. Missing unique key for constraint 'fk_orders_products' in the referenced table 'products')

Let's see what we need to do to fix this.

Step 0: Run a down migration

When creating migrations with golang-migrate, the user is prompted to write both "up" and "down" instructions in two separate files. The "up" file is meant for applying the desired changes and the "down" file is meant to roll back these changes. Since we should already have the down migration defined in this "down" file, let's try running it to get back where we started:

migrate \
-source file://db/migrations \
-database "mysql://root:pass@tcp(localhost:3306)/example" \
down 1

After checking the state of the database, golang-migrate will tell you:

error: Dirty database version 20250403065350. Fix and force version.

That's surprising. Our database is still at the same version and still carries the "dirty" tag. The down migration was never applied: while the database is in a "dirty" state, golang-migrate will not make any further changes to it, up or down, until it's fixed by the user.

Step 1: Plan a fix

Seeing that migrate expects us to make the necessary fixes, we now need to understand how to bring the database back to a clean state. To do this, we must first locate the error, note any statements that were applied before the failure, and plan a "fix" script to undo these changes.

The first place to look is the error message, which says the error occurred on line 0, but it also says the error occurred when trying to add a foreign key constraint to the orders table. We created a foreign key that references a column without a unique constraint. Thinking about it, without an index on this column, the database would need to perform a full table scan in order to maintain the integrity of the FK constraint, so it makes sense that the migration failed here.

Looking at the migration file, we see that adding the constraint is the last statement in the migration, so we can safely assume that the first two statements were applied successfully. This means that if we want to roll back the database to the previous step, we need to reverse the statements:

DROP TABLE orders;

DROP TABLE products;
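The reasoning behind this fix script (undo whatever ran before the failure, in reverse order) can be sketched in Go. This is only an illustration under narrow assumptions: `planFix` is a hypothetical helper that knows how to undo `CREATE TABLE` and nothing else, and it deliberately gives up on any other statement so that a human writes that part.

```go
package main

import (
	"fmt"
	"regexp"
)

// createTableRe captures the table name of a CREATE TABLE statement.
var createTableRe = regexp.MustCompile(`(?i)^\s*CREATE\s+TABLE\s+(\w+)`)

// planFix returns statements that undo stmts[:failed] in reverse order,
// where failed is the index of the statement that returned an error.
// It only understands CREATE TABLE; anything else yields an error so the
// undo for that statement is written by hand.
func planFix(stmts []string, failed int) ([]string, error) {
	if failed < 0 || failed > len(stmts) {
		return nil, fmt.Errorf("failed index %d out of range", failed)
	}
	var fix []string
	for i := failed - 1; i >= 0; i-- {
		m := createTableRe.FindStringSubmatch(stmts[i])
		if m == nil {
			return nil, fmt.Errorf("statement %d needs a hand-written undo", i)
		}
		fix = append(fix, "DROP TABLE "+m[1]+";")
	}
	return fix, nil
}

func main() {
	stmts := []string{
		"CREATE TABLE products (id SERIAL PRIMARY KEY, name VARCHAR(255) NOT NULL)",
		"CREATE TABLE orders (id SERIAL PRIMARY KEY, product_fk VARCHAR(255) NOT NULL)",
		"ALTER TABLE orders ADD CONSTRAINT fk_orders_products FOREIGN KEY (product_fk) REFERENCES products (name)",
	}
	// The third statement (index 2) is the one that failed.
	fix, err := planFix(stmts, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range fix {
		fmt.Println(s)
	}
}
```

For our migration this yields the same two `DROP TABLE` statements, `orders` first and `products` second. Real migrations rarely reverse this mechanically, which is exactly why the manual step is expensive.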

Step 2: Fix the database

With our fix script in hand, we can now run it against the database. But we still have one problem: migrate still has the database marked as "dirty".

So we need to run the fix script manually.

There are many ways to run a SQL script against a database. Whichever you choose, you will need direct, privileged access to the production database that you can use ad hoc.

Step 3: Force version

After running the fix script, we can now run migrate force to tell golang-migrate that we are done fixing things and it can now mark the database as clean again:

migrate -database "mysql://root:pass@tcp(localhost:3306)/example" -path=. force <previous version>

Finally, we are back at the beginning.

Step 4: Fix the migration

Now let's fix our original migration to add the missing unique constraint to the products table:

CREATE TABLE products
(
id SERIAL PRIMARY KEY,
- name VARCHAR(255) NOT NULL,
+ name VARCHAR(255) UNIQUE NOT NULL,
description TEXT,
price DECIMAL(10, 2) NOT NULL,
stock INT NOT NULL DEFAULT 0
);

Step 5: Rerun the migration

We can now run the migration again:

migrate -source file://db/migrations -database "mysql://root:pass@tcp(localhost:3306)/example" up 1

And this time it should work:

20250403134507/u init (56.108792ms)

Summary: Speed bumps with golang-migrate

The main advantages of golang-migrate are its simplicity and wide database support. The two are actually linked; by keeping the tool simple, it's easy to add support for new databases, some of which aren't even relational. As we mentioned above, when things work as expected, golang-migrate's simplicity makes it a great tool to speed up schema migrations.

However, when things go wrong (which can be quite common with migrations), golang-migrate can actually slow down the migration process. Here are some of the main issues we see:

  • Fragile - The tool enters a "dirty" state when encountering any form of failure and cannot automatically recover from partial failures using rollbacks.
  • Expensive Manual Fixes - The tool requires manual intervention to fix the database state. In our example, we had a migration that was relatively simple and easy to revert, so one can only imagine the implications of a migration failure on a much larger project.
  • Requires Privileged Access - A consequence of the above is that in order to solve day-to-day issues with migrations, you need direct access to the production database and are expected to run one-off, uncommitted code against it to revert changes and force the database into a "clean" state. This is a big no-no in most organizations, and for good reason.

A different approach: Atlas

Having experienced these pitfalls, we wanted to build a tool that was more resilient to migration failures and better suited for modern development practices. We wanted to create something that could handle the complexities of large teams, distributed development, and automated deployments.

To bake this into the tool, we took a different, more involved approach to migrations. It was harder to build, and it sure makes it more difficult to add support for new databases, but we think it was worth it. Here are some of the key features of Atlas that help users in more ways than just applying the migrations:

  • Automatic Rollbacks - For databases that support transactional DDL (like PostgreSQL), Atlas will automatically roll back failed migrations.
  • Statement-Level Tracking - Atlas executes migrations one statement at a time and tracks their progress. If a migration partially fails, Atlas knows where it left off and can resume from the last successful statement. No need to roll back or rerun everything from scratch.
  • Dynamic Down Migrations - Atlas' migrate down command does not rely on pre-computed down migrations. Instead, it computes the down migration dynamically based on the current state of the database. This enables Atlas users to easily recover from partial failures without needing to "break glass" and run a manual fix script.
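To illustrate the statement-level tracking idea from the list above, here is a minimal Go sketch. It is not Atlas's implementation: the `progress` type, the `resume` function, and the simulated `errTransient` failure are hypothetical, and the model assumes the migration file itself is unchanged between runs.

```go
package main

import (
	"errors"
	"fmt"
)

// errTransient is a simulated failure used only by this example.
var errTransient = errors.New("lock wait timeout")

// progress records how many statements of one migration have been applied.
type progress struct {
	version string
	applied int
}

// resume executes statements starting from the first one not yet applied,
// advancing the counter after each success. On failure the counter points
// at the failed statement, so the next call picks up from there.
func resume(p *progress, stmts []string, exec func(string) error) error {
	for p.applied < len(stmts) {
		if err := exec(stmts[p.applied]); err != nil {
			return fmt.Errorf("statement %d of %d: %w", p.applied+1, len(stmts), err)
		}
		p.applied++
	}
	return nil
}

func main() {
	stmts := []string{
		"CREATE TABLE a (id INT)",
		"CREATE TABLE b (id INT)",
		"CREATE INDEX idx_b ON b (id)",
	}
	failing := true
	exec := func(s string) error {
		if failing && s == stmts[2] {
			return errTransient
		}
		fmt.Println("executed:", s)
		return nil
	}

	p := &progress{version: "20250403065350"}
	fmt.Println("first run:", resume(p, stmts, exec))

	failing = false // the cause of the failure has been dealt with
	fmt.Println("second run:", resume(p, stmts, exec))
}
```

The second run executes only the third statement, because the first two are already recorded as applied.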

To learn more about Atlas's error handling and recovery features, check out our blog post about troubleshooting migrations and the Atlas Docs.

The Atlas experience

Atlas was built to be a developer-friendly migration tool based on modern DevOps practices. Beyond applying migrations, Atlas offers features that ease a schema migration process that tends to get more complicated as projects and teams grow:

  • Schema as Code - Atlas allows you to define your database schema in a declarative way, using HCL, SQL or your favorite ORM.
  • Automatic Planning - Atlas automatically plans migrations based on the current state of the database and the desired state. By calculating the diff between the two, Atlas takes away the need to manually write migration scripts.
  • Automatic Code Review - Atlas automatically lints and tests migrations before applying them. This helps catch errors early and ensures that migrations are safe to apply.
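The planning idea in the list above, computing the diff between a current and a desired state, can be illustrated with a deliberately tiny Go sketch. This is not how Atlas's engine works internally: the `schema` type and the `plan` function are hypothetical, column order within a new table is simply alphabetical, and type changes, constraints, and indexes are ignored.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// schema maps table name -> column name -> column type.
type schema map[string]map[string]string

// sortedKeys returns the keys of m in sorted order for stable output.
func sortedKeys[V any](m map[string]V) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// createTable renders a CREATE TABLE statement with columns sorted by name.
func createTable(name string, cols map[string]string) string {
	defs := make([]string, 0, len(cols))
	for _, c := range sortedKeys(cols) {
		defs = append(defs, c+" "+cols[c])
	}
	return fmt.Sprintf("CREATE TABLE %s (%s);", name, strings.Join(defs, ", "))
}

// plan returns the statements that move current toward desired: new tables
// and column changes first, then drops of tables that are no longer wanted.
func plan(current, desired schema) []string {
	var out []string
	for _, t := range sortedKeys(desired) {
		cols, exists := current[t]
		if !exists {
			out = append(out, createTable(t, desired[t]))
			continue
		}
		for _, c := range sortedKeys(desired[t]) {
			if _, ok := cols[c]; !ok {
				out = append(out, fmt.Sprintf("ALTER TABLE %s ADD COLUMN %s %s;", t, c, desired[t][c]))
			}
		}
		for _, c := range sortedKeys(cols) {
			if _, ok := desired[t][c]; !ok {
				out = append(out, fmt.Sprintf("ALTER TABLE %s DROP COLUMN %s;", t, c))
			}
		}
	}
	for _, t := range sortedKeys(current) {
		if _, ok := desired[t]; !ok {
			out = append(out, "DROP TABLE "+t+";")
		}
	}
	return out
}

func main() {
	current := schema{
		"products": {"id": "INT"},
	}
	desired := schema{
		"products": {"id": "INT", "name": "VARCHAR(255)"},
		"orders":   {"id": "INT", "product_fk": "VARCHAR(255)"},
	}
	for _, s := range plan(current, desired) {
		fmt.Println(s)
	}
}
```

Swapping the two arguments produces the statements for the opposite direction, which hints at why a tool that can inspect the live database does not need hand-written down files.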

Wrapping up

golang-migrate is great when everything goes smoothly, but when things break, it often leaves you with more work and lost time.

Atlas takes a different approach: transactional safety, statement-level tracking, and dynamic rollbacks help you recover from failures gracefully and keep moving forward. It's built for modern teams, CI/CD pipelines, and driven developers who want to spend as little time debugging migrations as possible.

Check it out at atlasgo.io.


As always, we would love to hear your feedback and suggestions on our Discord server.

Update April 8, 2025: This post was revised to reflect its goals more accurately.