With the release of v0.6.0, we are happy to announce official support for a style of workflow for managing changes to database schemas that we have been experimenting with in the past months: Versioned Migration Authoring.
TL;DR
- Atlas supports a declarative workflow (similar to Terraform) where users provide the
desired database schema in a simple data definition language
and Atlas calculates a plan to get a target database to that state. This workflow
is supported by the
schema apply
command. - Many teams prefer a more imperative approach where each change to the database schema is checked-in to source control and reviewed during code-review. This type of workflow is commonly called versioned migrations (or change based migrations) and is supported by many established tools such as Flyway and Liquibase.
- The downside of the versioned migration approach is, of course, that it puts the burden of planning the migration on developers. As part of the Atlas project we advocate for a third combined approach that we call "Versioned Migration Authoring".
- Versioned Migration Authoring is an attempt to combine the simplicity and expressiveness of the declarative approach with the control and explicitness of versioned migrations.
- To use Versioned Migration Authoring today, use the
atlas migrate diff
command. See the Getting Started section below for instructions.
Declarative Migrations
The declarative approach has become increasingly popular with engineers nowadays because it embodies a convenient separation of concerns between application and infrastructure engineers. Application engineers describe what (the desired state) they need to happen, and infrastructure engineers build tools that plan and execute ways to get to that state (how). This division of labor allows for great efficiencies as it abstracts away the complicated inner workings of infrastructure behind a simple, easy to understand API for the application developers and allows for specialization and development of expertise to pay off for the infra people.
With declarative migrations, the desired state of the database schema is given as input to the migration engine, which plans and executes a set of actions to change the database to its desired state.
For example, suppose your application uses a small SQLite database to store its data.
In this database, you have a users
table with this structure:
schema "main" {}
table "users" {
schema = schema.main
column "id" {
type = int
}
column "greeting" {
type = text
}
}
Now, suppose that you want to add a default value of "shalom"
to the greeting
column. Many developers are not aware that it isn't possible to modify a column's
default value in an existing table in SQLite. Instead, the common practice is to
create a new table, copy the existing rows into the new table and drop the old one
after. Using the declarative approach, developers can change the default value for
the greeting
column:
schema "main" {}
table "users" {
schema = schema.main
column "id" {
type = int
}
column "greeting" {
type = text
default = "shalom"
}
}
And have Atlas's engine devise a plan similar to this:
-- Planned Changes:
-- Create "new_users" table
CREATE TABLE `new_users` (`id` int NOT NULL, `greeting` text NOT NULL DEFAULT 'shalom')
-- Copy rows from old table "users" to new temporary table "new_users"
INSERT INTO `new_users` (`id`, `greeting`) SELECT `id`, IFNULL(`greeting`, 'shalom') AS `greeting` FROM `users`
-- Drop "users" table after copying rows
DROP TABLE `users`
-- Rename temporary table "new_users" to "users"
ALTER TABLE `new_users` RENAME TO `users`
Versioned Migrations
As the database is one of the most critical components in any system, applying changes to its schema is rightfully considered a dangerous operation. For this reason, many teams prefer a more imperative approach where each change to the database schema is checked-in to source control and reviewed during code-review. Each such change is called a "migration", as it migrates the database schema from the previous version to the next. To support this kind of requirement, many popular database schema management tools such as Flyway, Liquibase or golang-migrate support a workflow that is commonly called "versioned migrations".
In addition to the higher level of control which is provided by versioned migrations, applications are often deployed to multiple remote environments at once. These environments are not controlled (or even accessible) by the development team. In such cases, declarative migrations, which rely on a network connection to the target database and on human approval of migrations plans in real-time, are not a feasible strategy.
With versioned migrations (sometimes called "change-based migrations"), instead of describing the desired state ("what the database should look like"), developers describe the changes themselves ("how to reach the state"). Most of the time, this is done by creating a set of SQL files containing the statements needed. Each of the files is assigned a unique version and a description of the changes. Tools like the ones mentioned earlier are then able to interpret the migration files and to apply (some of) them in the correct order to transition to the desired database structure.
The benefit of the versioned migrations approach is that it is explicit: engineers know exactly what queries are going to be run against the database when the time comes to execute them. Because changes are planned ahead of time, migration authors can control precisely how to reach the desired schema. If we consider a migration as a plan to get from state A to state B, oftentimes multiple paths exist, each with a very different impact on the database. To demonstrate, consider an initial state which contains a table with two columns:
CREATE TABLE users (
id int,
name varchar(255)
);
Suppose our desired state is:
CREATE TABLE users (
id int,
user_name varchar(255)
);
There are at least two ways get from the initial to the desired state:
- Drop the
name
column and create a newuser_name
column. - Alter the name of the
name
column touser_name
.
Depending on the context, either may be the desired outcome for the developer planning the change. With versioned migrations, engineers have the ultimate confidence of what change is going to happen, which may not be known ahead of time in a declarative approach.
Migration Authoring
The downside of the versioned migration approach is, of course, that it puts the burden of planning the migration on developers. This requires a certain level of expertise that is not always available to every engineer, as we demonstrated in our example of setting a default value in a SQLite database above.
As part of the Atlas project we advocate for a third combined approach that we call "Versioned Migration Authoring". Versioned Migration Authoring is an attempt to combine the simplicity and expressiveness of the declarative approach with the control and explicitness of versioned migrations.
With versioned migration authoring, users still declare their desired state and use the Atlas engine to plan a safe migration from the existing to the new state. However, instead of coupling planning and execution, plans are instead written into normal migration files which can be checked-in to source control, fine-tuned manually and reviewed in regular code review processes.
Getting started
Start by downloading the Atlas CLI:
- macOS + Linux
- Homebrew
- Docker
- Windows
- Manual Installation
To download and install the latest release of the Atlas CLI, simply run the following in your terminal:
curl -sSf https://atlasgo.sh | sh
Get the latest release with Homebrew:
brew install ariga/tap/atlas
To pull the Atlas image and run it as a Docker container:
docker pull arigaio/atlas
docker run --rm arigaio/atlas --help
If the container needs access to the host network or a local directory, use the --net=host
flag and mount the desired
directory:
docker run --rm --net=host \
-v $(pwd)/migrations:/migrations \
arigaio/atlas migrate apply
--url "mysql://root:pass@:3306/test"
Download the latest release and move the atlas binary to a file location on your system PATH.
Next, define a simple Atlas schema with one table and an empty migration directory:
schema "test" {}
table "users" {
schema = schema.test
column "id" {
type = int
}
}
Let's run atlas migrate diff
with the necessary parameters to generate a migration script for
creating our users
table:
--dir
the URL to the migration directory, by default it isfile://migrations
.--to
the URL of the desired state, an HCL file or a database connection.--dev-url
a URL to a Dev Database that will be used to compute the diff.
atlas migrate diff create_users \
--dir="file://migrations" \
--to="file://schema.hcl" \
--dev-url="mysql://root:pass@:3306/test"
Observe that two files were created in the migrations
directory:
- 20220811074144_create_users.sql
- atlas.sum
By default, migration files are named with the following format {{ now }}_{{ name }}.sql
.
If you wish to use a different file format, use the --dir-format
option.
-- create "users" table
CREATE TABLE `users` (`id` int NOT NULL) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
In addition to the migration directory, Atlas maintains a file name atlas.sum
which is used
to ensure the integrity of the migration directory and force developers to deal with situations
where migration order or contents were modified after the fact.
h1:t1fEP1rSsGf1gYrYCjsGyEyuM0cnhATlq93B7h8uXxY=
20220811074144_create_users.sql h1:liZcCBbAn/HyBTqBAEVar9fJNKPTb2Eq+rEKZeCFC9M=
Further reading
To learn more about Versioned Migration Authoring:
- Read the docs
- CLI Command Reference
Have questions? Feedback? Find our team on our Discord server.