The "dirty secret" of golang-migrate
Editor's note: Dear Reader, please accept my sincere apologies for the blatant dad joke in the title. Being a father of two, I couldn't resist the pun. -RT
Atlas was originally created to support Ent, a popular Go ORM. From the start, Ent shipped with a simple "auto-migration" feature that could set up the database schema based on the Ent schema. However, as the project grew popular, it became clear that a more robust, versioned migration system was needed.
Our team did not originally set out to build a new migration tool; Ent's authors had hoped to add functionality to generate migration files based on the existing "auto-migration" engine and use an off-the-shelf migration tool to apply them. The most promising candidate was golang-migrate, a widely adopted migration tool in the Go community, renowned for its simplicity and wide database support.
But like many tools that start simple and grow popular, golang-migrate begins to show its limitations as projects (and teams) scale. In this post, we'll review some of the issues we encountered with golang-migrate and how they ultimately led us to build Atlas.
The "dirty" state
When migrations are done properly, using golang-migrate is straightforward: write your SQL files, apply them in order, and you're done.
But it's unrealistic to expect that no mistakes will be made. If something goes wrong mid-migration (say, a simple typo), you will see an error like this:
error: migration failed in line 0: create table t1 (
id int primary key,
); (details: Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')' at line 3)
After fixing the error, you may intuitively rerun the migration, but now you're met with a new error:
error: Dirty database version 20250403065350. Fix and force version.
This "dirty" state means the last attempted migration didn't finish running and golang-migrate is now stuck. It won't apply future migrations until you manually resolve the issue and reset the version.
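To make the mechanism concrete, here is a minimal Python sketch of a version table with a dirty flag, using the standard library's sqlite3 module. It is an illustrative model, not golang-migrate's code: the `schema_migrations` table with `version` and `dirty` columns is modeled loosely on the bookkeeping table the tool keeps for SQL databases, and every function name here is hypothetical.

```python
import sqlite3


class DirtyDatabaseError(Exception):
    """Raised when the previous migration attempt never finished."""


def open_db():
    # Autocommit mode: every statement takes effect immediately,
    # with no surrounding transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE schema_migrations (version INTEGER NOT NULL, dirty INTEGER NOT NULL)"
    )
    return conn


def current_state(conn):
    # Returns (version, dirty), or None when nothing has been applied yet.
    return conn.execute("SELECT version, dirty FROM schema_migrations").fetchone()


def set_version(conn, version, dirty):
    conn.execute("DELETE FROM schema_migrations")
    conn.execute("INSERT INTO schema_migrations VALUES (?, ?)", (version, int(dirty)))


def apply_migration(conn, version, statements):
    state = current_state(conn)
    if state is not None and state[1]:
        raise DirtyDatabaseError(
            f"Dirty database version {state[0]}. Fix and force version."
        )
    set_version(conn, version, dirty=True)   # mark dirty before touching the schema
    for stmt in statements:
        conn.execute(stmt)                   # an exception here leaves dirty=1 behind
    set_version(conn, version, dirty=False)  # reached only if every statement succeeded
```

In this model, a call to `apply_migration` whose statement fails leaves the row `(version, 1)` in place, so the next call raises `DirtyDatabaseError` before running anything, even if the migration has since been fixed. Only resetting that row lets migrations proceed again.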
If you've ever used a relational database, you might be asking yourself: why not just use transactions? Start executing the migration's statements, and if one of them hits an error, roll back the transaction!
With golang-migrate, however, migrations are not wrapped in transactions by default, even for databases that support transactional DDL. Each file is applied independently. If one statement in the file fails, earlier statements remain applied and the database is left in an unknown limbo state.
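For contrast, here is a minimal sketch of what wrapping a migration in a transaction buys you. It uses SQLite through Python's standard sqlite3 module purely as a stand-in, because SQLite supports transactional DDL and needs no server; the helper names are hypothetical. Note the assumption: this only helps on databases where DDL can be rolled back. MySQL, used in this post's examples, implicitly commits around most DDL statements, so the same wrapper would not protect it.

```python
import sqlite3


def apply_in_transaction(conn, statements):
    """Run all statements atomically; undo everything if any of them fails.

    Expects a connection opened with isolation_level=None so that the
    explicit BEGIN/COMMIT/ROLLBACK below are the only transaction control.
    """
    conn.execute("BEGIN")
    try:
        for stmt in statements:
            conn.execute(stmt)
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # earlier statements in the file are undone too
        raise
    conn.execute("COMMIT")


def table_names(conn):
    """List user tables, to inspect what a migration left behind."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]
```

If the second of two `CREATE TABLE` statements fails, `table_names` shows no tables afterwards: the first table was rolled back along with the failure, so there is nothing to clean up by hand.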
How to fix "Dirty database version"
So how does one reset their database to apply their fixed migration? Let's go over the process using an example.
Let's say you create a migration that adds two new tables and sets up a foreign key constraint between them:
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    price DECIMAL(10, 2) NOT NULL,
    stock INT NOT NULL DEFAULT 0
);
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_fk VARCHAR(100) NOT NULL,
    product_fk VARCHAR(255) NOT NULL
);
ALTER TABLE orders
    ADD CONSTRAINT fk_orders_products
    FOREIGN KEY (product_fk) REFERENCES products (name) ON DELETE CASCADE;
Trying to apply the migration:
migrate -source file://db/migrations -database "mysql://root:pass@tcp(localhost:3306)/example" up 1
An error occurs:
error: migration failed in line 0: CREATE TABLE products (
# ... redacted for brevity
(details: Error 6125: Failed to add the foreign key constraint. Missing unique key for constraint 'fk_orders_products' in the referenced table 'products')
Let's see what we need to do to fix this.
Step 0: Run a down migration
When creating migrations with golang-migrate, the user is prompted to write both "up" and "down" instructions in two separate files. The "up" file is meant for applying the desired changes and the "down" file is meant to roll back these changes. Since we should already have the down migration defined in this "down" file, let's try running it to get back where we started:
migrate \
-source file://db/migrations \
-database "mysql://root:pass@tcp(localhost:3306)/example" \
down 1
After checking the state of the database, golang-migrate will tell you:
error: Dirty database version 20250403065350. Fix and force version.
That's surprising. Our database is at the same version and still has the "dirty" tag. We asked migrate down to apply the down migration, but since the database is in a "dirty" state, golang-migrate will not make any further changes to it until the user fixes it.
Step 1: Plan a fix
Seeing that migrate expects us to make the necessary fixes, we now need to understand how to bring the database back to a clean state. To do this, we must first locate the error, note any statements that were applied before the failure, and plan a "fix" script to undo these changes.
The first place to look is the error message. The line number it reports (line 0) is not much help, but the details say the failure occurred when trying to add a foreign key constraint to the orders table. We created a foreign key that references a column without a unique constraint. Thinking about it, without an index on this column, the database would need to perform a full table scan in order to maintain the integrity of the FK constraint, so it makes sense that the migration failed here.
Looking at the migration file, we see that adding the constraint is the last statement in the migration, so we can safely assume that the first two statements were applied successfully. This means that if we want to roll back the database to the previous step, we need to reverse the statements:
DROP TABLE orders;
DROP TABLE products;
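The reasoning above (take the statements that succeeded, undo them newest first) can be sketched in a few lines of Python. This is a deliberately narrow, hypothetical helper, not something golang-migrate provides: it only knows how to undo `CREATE TABLE`, and it refuses anything else so that a human reviews the statement rather than the script guessing.

```python
import re

# Captures the table name of a CREATE TABLE statement.
CREATE_TABLE = re.compile(
    r'^\s*CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?([\w.`"]+)',
    re.IGNORECASE,
)


def plan_fix(applied_statements):
    """Return statements that undo the applied ones, newest first."""
    fix = []
    for stmt in reversed(applied_statements):
        match = CREATE_TABLE.match(stmt)
        if match is None:
            raise ValueError(f"don't know how to undo: {stmt.strip()[:40]!r}")
        fix.append(f"DROP TABLE {match.group(1)};")
    return fix
```

Passing the two `CREATE TABLE` statements from the failed migration, in the order they ran, yields the same two `DROP TABLE` statements as the hand-written script, with `orders` dropped before `products`.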
Step 2: Fix the database
With our fix script in hand, we can now run it against the database. But we still have one problem: migrate still has the database marked as "dirty", so it will not run anything for us. We need to run the fix script manually.
There are many ways to run a SQL script against a database. In any case, make sure that you have direct, privileged access to the production database, which you can use ad-hoc.
Step 3: Force version
After running the fix script, we can now run migrate force to tell golang-migrate that we are done fixing things and it can now mark the database as clean again:
migrate -database "mysql://root:pass@tcp(localhost:3306)/example" -path=. force <previous version>
Finally, we are back at the beginning.
Step 4: Fix the migration
Now let's fix our original migration to add the missing unique constraint to the products table:
  CREATE TABLE products
  (
      id SERIAL PRIMARY KEY,
-     name VARCHAR(255) NOT NULL,
+     name VARCHAR(255) UNIQUE NOT NULL,
      description TEXT,
      price DECIMAL(10, 2) NOT NULL,
      stock INT NOT NULL DEFAULT 0
  );
Step 5: Rerun the migration
We can now run the migration again:
migrate -source file://db/migrations -database "mysql://root:pass@tcp(localhost:3306)/example" up 1
And this time it should work:
20250403134507/u init (56.108792ms)
Summary: Speed bumps with golang-migrate
The main advantages of golang-migrate are its simplicity and wide database support. The two are actually linked: by keeping the tool simple, it's easy to add support for new databases, some of which aren't even relational. As we mentioned above, when things work as expected, golang-migrate's simplicity makes it a great tool to speed up schema migrations.
However, when things go wrong (which can be quite common with migrations), golang-migrate can actually slow down the migration process.
Here are some of the main issues we see:
- Fragile - The tool enters a "dirty" state when encountering any form of failure and cannot automatically recover from partial failures using rollbacks.
- Expensive Manual Fixes - The tool requires manual intervention to fix the database state. In our example, we had a migration that was relatively simple and easy to revert, so one can only imagine the implications of a migration failure on a much larger project.
- Requires Privileged Access - A consequence of the above is that in order to solve day-to-day issues with migrations, you need direct access to the production database and are expected to run one-off, uncommitted code against it to revert changes and force the database into a "clean" state. This is a big no-no in most organizations, and for good reason.
A different approach: Atlas
Having experienced these pitfalls, we wanted to build a tool that was more resilient to migration failures and better suited for modern development practices. We wanted to create something that could handle the complexities of large teams, distributed development, and automated deployments.
To bake this into the tool, we took a different, more involved approach to migrations. It was harder to build, and it sure makes it more difficult to add support for new databases, but we think it was worth it. Here are some of the key features of Atlas that help users in more ways than just applying the migrations:
- Automatic Rollbacks - For databases that support transactional DDL (like PostgreSQL), Atlas will automatically roll back failed migrations.
- Statement-Level Tracking - Atlas executes migrations one statement at a time and tracks their progress. If a migration partially fails, Atlas knows where it left off and can resume from the last successful statement. No need to roll back or rerun everything from scratch.
- Dynamic Down Migrations - Atlas's migrate down command does not rely on pre-computed down migrations. Instead, it computes the down migration dynamically based on the current state of the database. This enables Atlas users to easily recover from partial failures without needing to "break glass" and run a manual fix script.
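To illustrate the idea behind statement-level tracking from the list above, here is a small Python sketch. It is not Atlas's implementation: the `revisions` table, its columns, and the function name are all hypothetical, and SQLite (via the standard sqlite3 module) stands in for a real database. The point is only that recording progress after each statement makes a rerun resumable.

```python
import sqlite3


def apply_with_tracking(conn, version, statements):
    """Apply statements one at a time, recording how many have succeeded.

    On a rerun, statements already recorded as applied are skipped.
    Expects a connection opened with isolation_level=None (autocommit).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS revisions "
        "(version TEXT PRIMARY KEY, applied INTEGER NOT NULL)"
    )
    row = conn.execute(
        "SELECT applied FROM revisions WHERE version = ?", (version,)
    ).fetchone()
    applied = row[0] if row else 0
    if row is None:
        conn.execute("INSERT INTO revisions VALUES (?, 0)", (version,))
    for stmt in statements[applied:]:
        conn.execute(stmt)  # raises on failure; progress so far stays recorded
        applied += 1
        conn.execute(
            "UPDATE revisions SET applied = ? WHERE version = ?", (applied, version)
        )
    return applied
```

If the third statement of a migration fails, the table records that two were applied. After the file is corrected, calling the function again with the same version skips the first two statements and picks up at the third, instead of failing on tables that already exist.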
To learn more about Atlas's error handling and recovery features, check out our blog post about troubleshooting migrations and the Atlas Docs.
The Atlas experience
Atlas was built to be a developer-friendly migration tool based on modern DevOps practices. Beyond being a migration tool, Atlas offers a more involved experience to our users to help ease the schema migration process that can get more complicated as projects and teams grow:
- Schema as Code - Atlas allows you to define your database schema in a declarative way, using HCL, SQL or your favorite ORM.
- Automatic Planning - Atlas automatically plans migrations based on the current state of the database and the desired state. By calculating the diff between the two, Atlas takes away the need to manually write migration scripts.
- Automatic Code Review - Atlas automatically lints and tests migrations before applying them. This helps catch errors early and ensures that migrations are safe to apply.
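As a rough intuition for what "calculating the diff" between a current and a desired state means, consider this toy Python sketch. It is a simplification under stated assumptions, not how Atlas plans migrations: schemas are modeled as plain dictionaries mapping table names to `{column: type}`, the function name is hypothetical, and changes to an existing column's type, constraints, and indexes are ignored entirely.

```python
def plan(current, desired):
    """Return SQL statements that turn `current` into `desired`.

    Both arguments map table name -> {column name: column type}.
    Only added/dropped tables and columns are detected.
    """
    statements = []
    # Tables that exist only in the desired state must be created.
    for table in sorted(desired.keys() - current.keys()):
        cols = ", ".join(f"{name} {ctype}" for name, ctype in desired[table].items())
        statements.append(f"CREATE TABLE {table} ({cols});")
    # Tables present in both states are compared column by column.
    for table in sorted(desired.keys() & current.keys()):
        have, want = current[table], desired[table]
        for col in sorted(want.keys() - have.keys()):
            statements.append(f"ALTER TABLE {table} ADD COLUMN {col} {want[col]};")
        for col in sorted(have.keys() - want.keys()):
            statements.append(f"ALTER TABLE {table} DROP COLUMN {col};")
    # Tables that exist only in the current state must be dropped.
    for table in sorted(current.keys() - desired.keys()):
        statements.append(f"DROP TABLE {table};")
    return statements
```

Because the plan is derived from the two states rather than written by hand, an empty diff produces an empty plan, and the same function describes the way forward from any intermediate state the database happens to be in.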
Wrapping up
golang-migrate is great when everything goes smoothly, but when things break, it often leaves you with more work and lost time.
Atlas takes a different approach: transactional safety, statement-level tracking, and dynamic rollbacks help you recover from failures gracefully and keep moving forward. It's built for modern teams, CI/CD pipelines, and driven developers who want to spend as little time debugging migrations as possible.
Check it out at atlasgo.io.
As always, we would love to hear your feedback and suggestions on our Discord server.
Update April 8, 2025: This post was revised to reflect its goals more accurately.