The Myth of Down Migrations; Introducing Atlas Migrate Down
TL;DR
Ever since my first job as a junior engineer, the seniors on my team told me that whenever I make a schema change I must write the corresponding "down migration", so it can be reverted at a later time if needed. But what if that advice, while well-intentioned, deserves a second look?
Today, I want to argue that contrary to popular belief, down migration files are actually a bad idea and should be actively avoided.
In the final section, I'll introduce an alternative that may sound completely contradictory: the new migrate down command. I will explain the thought process behind its creation and show examples of how to use it.
Background
Since the beginning of my career, I have worked in teams where, whenever it came to database migrations, we were writing "down files" (ending with the .down.sql file extension). This was considered good practice and an example of how a "well-organized project should be."
Over the years, as my career shifted to focus mainly on infrastructure and database tooling in large software projects (at companies like Meta), I had the opportunity to question this practice and the reasoning behind it.
Down migrations were an odd thing. In my entire career, working on projects with thousands of down files, I never applied one in a real environment. As simple as that: not even once.
Furthermore, from the day we started Atlas to this very day, we have interviewed countless software engineers from virtually every industry. In all of these interviews, we have met only a single team that routinely applied down files in production (and even they were not happy with how it worked).
Why is that? Why is it that down files are so popular, yet so rarely used? Let's dive in.
Down migrations are the naively optimistic plan for a grim and unexpected world
Down migrations are supposed to be the "undo" counterpart of the "up" migration. Why do "undo" buttons exist? Because mistakes happen, things fail, and then we want a way to quickly and safely revert them. Database migrations are considered super risky, something we should do with caution. So, it makes sense to have a plan for reverting them, right?
But consider this: when we write a down file, we are essentially writing a script that will be executed in the future to revert the changes we are about to make. This script is written before the changes are applied, and it is based on the assumption that the changes will be applied correctly. But what if they are not?
When do we need to revert a migration? When it fails. But if it fails, it means that the database might be in an unknown state. It is quite likely that the database is not in the state that the down file expects it to be. For example, if the "up" migration was supposed to add two columns, the down file would be written to remove these two columns. But what if the migration was partially applied and only one column was added? Running the down file would fail, and we would be stuck in an unknown state.