
Announcing Atlas v0.10: Cloud Community Preview

· 4 min read
Ariel Mashraki
Building Atlas

It's been two months since the release of v0.9.0, so we figured it's about time to release a new version and share with you what we've accomplished so far, as well as what's to come in the upcoming weeks. Besides the many improvements and bug fixes in v0.10.0, we added two major features to Atlas that I want to share with you: schema loaders and the Community Preview of Atlas Cloud.

Schema Loaders

In our previous post, we discussed our motivation for developing an infrastructure to load desired states from external sources (not just SQL and HCL), and we highlighted the importance of schema loaders. Today, I'm happy to share that we've made significant progress on this front. We started by creating a schema loader for the Ent framework, and with the release of v0.10.0, Ent users can now use their ent/schema package as the desired state in all the different Atlas commands.

Using the new integration, users can compare an ent/schema package with any other state, apply it to a database, generate migrations from it, and much more. Here are two examples:

atlas migrate diff create_users \
--dir "file://migrations" \
--to "ent://path/to/schema" \
--dev-url "sqlite://dev?mode=memory&_fk=1"

I'm really eager to see this initiative come to fruition, as it has already proven to work well for the Ent community. We are now ready to expand support for additional frameworks and languages. In the upcoming weeks, you can expect to see additional integrations, such as GORM, Sequelize, and more. With these new superpowers, users will be able to manage all of their database schemas using a single tool - Atlas!

Atlas Cloud Community Preview

We are also super thrilled to announce the Community Preview of Atlas Cloud! Atlas Cloud is a cloud-based service that provides teams with an end-to-end solution for managing database schema changes. As part of the Community Preview, we are offering a free "Community" plan for all users which you can use to manage up to 5 migration directories for your team or personal projects.

One important feature that was recently added to Atlas is the ability to connect remote migration directories stored in GitHub to Atlas Cloud. This new functionality empowers users to easily audit and view their migration history and get migration linting checks on their PRs, such as detection of destructive or backward-incompatible changes.

Let's walk through a simple guide on how to set it up for a project with just a few clicks:

1. Log in to atlasgo.cloud and create a new workspace (organization) for your projects:

2. Once created, go to /dirs/configure and connect your migration directory stored in GitHub to Atlas Cloud:

3. After connecting your directory, you'll see an extensive overview of your migration history and the schema it represents:

4. From this point on, every change made to the migration directory will be reflected in Atlas Cloud. But what about the changes themselves? Here's where the magic happens. Once a directory is connected, any pull request that modifies it will be automatically checked and reviewed by Atlas!

Let's create a sample migration change, open a pull request, and see it in action:

Wonderful! However, that's not all. There is another detailed and visualized report available in Atlas Cloud that has been specifically created for this CI run. Go to the migration directory page and click the CI Runs button to check it out.

A big thanks to @giautm, @masseelch and @yonidavidson for building this feature for Atlas!

What next?

Well, this is just the beginning of Atlas Cloud! In the upcoming weeks, we will be rolling out several new major features that we have been working on lately, including schema drift detection, managed migration deployments, and much more. If any of these features sound interesting to you, please do not hesitate to contact us.

We would love to hear from you on our Discord server ❤️.

Announcing Atlas v0.9.0: SQL as a First-Class Citizen

· 6 min read
Ariel Mashraki
Building Atlas

For a long time, one of the most common feature requests we've been getting from our users is the ability to manage their desired "schema state" using SQL. This is understandable, using Atlas DDL (HCL) can feel unfamiliar to some users, especially those who have never worked with Terraform before. For this reason, we're excited to announce the release of Atlas v0.9.0, which now fully supports SQL.

Schema as Code (SaC)

Atlas applies the common IaC concept of declarative resource management to database schemas. With Atlas, users do not need to plan schema changes themselves. Instead of figuring out the correct SQL statements to update their database schemas, users provide Atlas with a schema definition that describes their desired state, and Atlas generates a migration plan to move from the current state to the desired state defined by the schema.

Starting from v0.9.0, users can use SQL schema files (or a directory) containing CREATE and ALTER statements to describe their desired state. To demonstrate this, let's use this schema example with a single users table:

schema.sql
-- create table "users
CREATE TABLE users(
id int NOT NULL,
name varchar(100) NULL,
PRIMARY KEY(id)
);

Given this schema file, Atlas offers two workflows to update databases:

  • Declarative: Similar to Terraform, Atlas compares the current state of the database schema with the desired state defined by the SQL schema, and generates a migration plan to reach that state.
  • Versioned: Atlas compares the current state defined by the migrations directory to the desired state defined by the SQL schema, and writes a new migration script to the directory to update the database schema to the desired state.
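
To give a feel for the second option, here is a minimal sketch of the versioned flow with the same schema.sql (it reuses flags shown elsewhere in this post; the directory and URLs are placeholders):

atlas migrate diff create_users \
  --dir "file://migrations" \
  --to "file://schema.sql" \
  --dev-url "docker://mysql/8/example"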

In this blog post, we'll focus on explaining how SQL schemas can be used with the declarative workflow. For the sake of simplicity, let's assume we have an empty database that we want to apply the schema above to:

atlas schema apply \
--url "mysql://root:pass@localhost:3306/example" \
--to "file://schema.sql" \
--dev-url "docker://mysql/8/example"
FLAGS
  • --url - the database URL to apply the schema to.
  • --to - URLs describing the desired state: an SQL or HCL schema definition, or a database URL.
  • --dev-url - a URL to a Dev Database that will be used to compute the diff.

Running the command above with the --auto-approve flag will apply the following changes:

-- Planned Changes:
-- Create "users" table
CREATE TABLE `users` (`id` int NOT NULL, `name` varchar NULL, PRIMARY KEY (`id`));

Hooray! We have successfully created the users table defined in our schema file. Let's inspect our database and ensure its schema was actually updated by the command above:

atlas schema inspect \
--url "mysql://root:pass@localhost:3306/example" \
--format "{{ sql . }}"

Excellent! As you can see, our database schema has been updated:

-- create "users" table
CREATE TABLE `users` (`id` int NOT NULL, `name` varchar NULL, PRIMARY KEY (`id`));

Now let's make our schema more interesting by adding a column to the users table and creating a blog_posts table with a foreign key that references users:

schema.sql
-- create table "users
CREATE TABLE users(
id int NOT NULL,
name varchar(100) NULL,
email varchar(50) NULL,
PRIMARY KEY(id)
);

-- create table "blog_posts"
CREATE TABLE blog_posts(
id int NOT NULL,
title varchar(100) NULL,
body text NULL,
author_id int NULL,
PRIMARY KEY(id),
CONSTRAINT author_fk FOREIGN KEY(author_id) REFERENCES users(id)
);

Next, executing atlas schema apply again will update the database schema with the following changes:

atlas schema apply
-- Planned Changes:
-- Add column "email" to table: "users"
ALTER TABLE `users` ADD COLUMN `email` varchar NULL;
-- Create "blog_posts" table
CREATE TABLE `blog_posts` (`title` varchar NULL, `body` text NULL, `author_id` int NULL, `id` int NOT NULL, PRIMARY KEY (`id`), CONSTRAINT `author_fk` FOREIGN KEY (`author_id`) REFERENCES `users` (`id`) ON UPDATE NO ACTION ON DELETE NO ACTION);

Boom! Atlas automatically calculates the difference between the current state of our database and the desired state defined by our schema file, and generates the necessary changes to migrate the database to the new state. We don't need to specify each individual migration – we simply tell Atlas what state we want the database to be in, and it handles the rest.

To see a full description of this generated migration plan, check out this diagram example in the Atlas public playground:

Diff ERD

Diff SQL

Atlas Playground

As part of this version, we have released the Atlas playground where users can visualize their database schemas in an interactive way. Simply provide an SQL or HCL schema, or import one from an existing database, and in return get an ERD visualizing their entire data model.

Users can also compare two schemas with the Schema Diff button, and get the SQL statements necessary to migrate from one schema to the other - give it a try!

Blog ERD

A big thanks to @solomonme, @ronenlu and @masseelch for contributing this feature to Atlas!

Schema Loaders

What's next? In the near future, we plan to add infrastructure for loading schemas from external sources. This will enable ORM maintainers to integrate with Atlas and provide their schema definitions as Atlas schemas. As a result, they can utilize the Atlas engine to diff schemas, plan and lint migrations, execute them on databases, and more.

The first ORM to integrate with Atlas will be Ent. Using this integration, Ent users will be able to generate Atlas schemas or migrations for their Ent projects with a single command:

atlas migrate diff create_users \
--dir "file://migrations" \
--to "ent://path/to/schema" \
--dev-url "docker://<driver-name>"

Would you like to see other ORMs integrated with Atlas? Please, join our Discord server and let me know.

What next?

Have questions or feedback? Feel free to reach out on our Discord server.

Learning to solve migration issues in a realistic simulation game

· 2 min read
Rotem Tamir
Building Atlas

Learning new things

One of my favorite things about software engineering is that it's a career of continuous learning. Every new project or challenge presents us with an opportunity to increase our knowledge and improve our skills. Different people learn in different ways, but for me, one of the most effective ways to learn something new has always been by doing. Sure, reading books, watching videos and immersing myself in technical documentation is foundational, but I only feel like I really understand something after I've applied it in some practical way.

This is why I'm excited to announce that we've partnered with Wilco to create a new kind of learning experience for Atlas users. Wilco develops a platform, built to emulate the conditions at a tech startup, that sends users on "quests" covering everyday engineering tasks - from deploying an app to finding the root cause of a production issue - using real-life tech stacks.

When I first heard about Wilco, I was immediately intrigued by the idea of using a simulation game to teach people about database migrations. Particularly interesting is the possibility of presenting people with opportunities to solve the kinds of issues that many engineers only hit during a serious outage, when there's very little room for error.

Introducing Wilco

When you use Wilco, you're put in the shoes of a software engineer at a fictional startup. You're given access to a made-up chat system humorously called "Snack" where you meet different characters that you can interact with. You are also given access to a real GitHub repo and the means to run a real development environment on your local machine.

Me, interacting with Keen, my team lead on Snack

The new Atlas quest on Wilco

In the first quest we're releasing today on Wilco, you will be tasked with planning a seemingly simple database migration to support the development of a new feature in your imaginary company's app. As your team starts to roll out the feature, you will quickly realize that there are some serious issues that must be resolved, and fast!

And now, without further ado, you can head over to start your quest!

What's next?

Have questions? Feedback? Feel free to reach out on our Discord server.

Exploring Database Schemas with Atlas

· 4 min read
Rotem Tamir
Building Atlas

Atlas is most commonly used for managing and applying schema changes to databases, but it can also be used for something else: exploring and understanding database schemas.

With inspection, Atlas connects to your database, analyzes its structure from the metadata tables, and creates a graph data structure that maps all the entities and relations within the database. Atlas can then take this graph and represent it in various formats for users to consume. In this post, I will present two such forms of representation: Entity Relationship Diagrams (ERDs) and JSON documents.

Schemas as ERDs

One of the most useful ways to represent a database schema is using an Entity Relationship Diagram (ERD). This allows developers to see the schema in a visual and intuitive way, making it easy to understand the relationships between different elements of the database. When using ERDs, because the data is presented in a graph format, you can easily navigate through the schema and see how different entities are connected. This can be especially useful when working with complex or large databases, as it allows you to quickly identify patterns and connections that might not be immediately obvious when looking at the raw data.

Using Explore to generate an ERD

To automatically generate an ERD from your database, you can use the Explore feature of Atlas Cloud. To visualize a schema using the Explore feature, you need to provide your database schema in one of two ways:

  1. Provide a connection string to your database. This will allow Atlas Cloud to connect to your database and automatically generate a schema from the metadata tables. Note: this method only works for databases that are publicly accessible via the internet.

  2. Provide the schema as an Atlas HCL file. If you have an existing Atlas project, you can use the atlas schema inspect command to generate the HCL file from your database.

    After installing Atlas, you can run the following command to generate the HCL representation of your database schema:

    # MySQL
    atlas schema inspect -u mysql://root:pass@localhost:3306/db_name

    # PostgreSQL
    atlas schema inspect -u "postgres://postgres:pass@localhost:5432/db_name?sslmode=disable"

Schemas as JSON documents

In addition to producing ERDs, Atlas can also produce a JSON document that represents the database schema. One of the key benefits of representing the database schema as a JSON document is that it allows you to use standard tools like jq to analyze the schema programmatically. jq is a popular command-line tool for working with JSON data, and it can be especially useful for exploring and manipulating the schema data generated by Atlas.

With jq, you can easily extract specific information from the schema, such as the names of all the tables in the database or the foreign key relationships between different entities. This makes it easy to write scripts or programs that can automatically analyze the schema and identify potential issues or opportunities for optimization.

To get the JSON representation of your database schema, you can use the atlas schema inspect command with a custom logging format:

atlas schema inspect -u '<url>' --format '{{ json . }}'

This will output the schema as a JSON document:

{
  "schemas": [
    {
      "name": "test",
      "tables": [
        {
          "name": "blog_posts",
          "columns": [
            {
              "name": "id",
              "type": "int"
            },
            {
              "name": "title",
              "type": "varchar(100)",
              "null": true
            },
            // .. Truncated for brevity ..
          ]
        }
      ]
    }
  ]
}

Once your schema is represented as a JSON document, you can use jq to analyze it. For example, to get a list of all the tables that contain a foreign key, run:

atlas schema inspect -u '<url>' --format '{{ json . }}' | jq '.schemas[].tables[] | select(.foreign_keys | length > 0) | .name'

This will output:

"blog_posts"

Wrapping up

In this blog post, we demonstrated how Atlas can be used as a schema inspection and visualization tool, in addition to its more commonly known use as a schema migration tool. We showed how to use the Explore feature to create an ERD from your database schema, and how to use the atlas schema inspect command to generate a JSON document that can be analyzed using jq and other tools.

Have questions? Feedback? Feel free to reach out on our Discord server.

Picking a database migration tool for Go projects in 2023

· 7 min read

Most software projects are backed by a database; that's widely accepted. The schema of this database almost always evolves over time: requirements change, features are added, and so the application's model of the world must evolve. When this model evolves, the database's schema must change as well. No one wants to (or should) connect to their production database and apply changes manually, which is why we need tools to manage schema changes. Most ORMs have basic support, but projects eventually tend to outgrow them. This is when projects reach for a dedicated schema migration tool.

Many such tools exist, and it's hard to know which one to choose. My goal in this article is to present three popular migration tools for Go projects to help you make this decision.

By way of introduction (and full disclosure): my name is Pedro Henrique, I'm a software engineer from Brazil, and I've been a contributing member of the Ent/Atlas community for quite a while. I really love open-source and think there's room for a diverse range of tools in our ecosystem, so I will do my best to provide you with an accurate, respectful, and fair comparison of the tools.

golang-migrate - Created: 2014 GitHub Stars: 10.3k
Golang migrate is one of the most famous tools for handling database migrations. Golang migrate has support for many database drivers and migration sources, and takes a simple and direct approach to handling database migrations.

Goose - Created: 2012 GitHub Stars: 3.2k
Goose is a solid option when choosing a migration tool. Goose has support for the main database drivers, and its main features include support for migrations written in Go and greater control over the migration application process.

Atlas - Created: 2021 GitHub Stars: 2.1k
Atlas is an open-source schema migration tool that supports a declarative workflow for schema migrations, making it a kind of "Terraform for databases". With Atlas, users can declare their desired schema and let Atlas automatically plan the migrations for them. In addition, Atlas supports classic versioned migration workflows, migration linting, and has a GitHub Actions integration.

Golang migrate

Golang migrate was initially created by Matt Kadenbach. In 2018 the project was handed over to Dale Hui, and today the project resides under the golang-migrate organization and is actively maintained, with 202 contributors.

One of Golang migrate's main strengths is the support for various database drivers. If your project uses a database driver that is not very popular, chances are that Golang migrate has a driver for it. For cases where your database is not supported, Golang migrate has a simple API for defining new database drivers. Databases supported by Golang migrate include: PostgreSQL, Redshift, Ql, Cassandra, SQLite, MySQL/MariaDB, Neo4j, MongoDB, Google Cloud Spanner, and more.

Another feature of Golang migrate is support for different migration sources, for cases where your migration scripts reside in custom locations or even on remote servers.
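
For a sense of day-to-day usage, a typical invocation looks roughly like this (a sketch; the directory and connection string are placeholders):

# Apply all pending migrations from a local directory.
migrate -source "file://migrations" \
  -database "postgres://user:pass@localhost:5432/db?sslmode=disable" up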

Goose

Goose takes a similar approach to Golang migrate. The project was initially created by Liam Staskawicz in 2012, and in 2016 Pressly created a fork, improving usability by adding support for migrations written in Go, handling out-of-order migrations, and allowing custom schemas for migration versioning. Today Goose has 80 contributors.

Goose supports only 7 database drivers, so if your project uses one of the main databases in the market, Goose should be a good fit. For migration sources, Goose allows only the filesystem, though it's worth pointing out that with Go embed it is possible to embed the migration files in the binary itself. Goose's main difference from Golang migrate is its support for migrations written in Go, for cases where it is necessary to query the database during the migration. Goose also allows different migration versioning schemes, improving on one key issue with Golang migrate.
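
For comparison, a typical Goose invocation might look like this (again a sketch; driver, DSN, and directory are placeholders):

# Apply all pending migrations using the MySQL driver.
goose -dir ./migrations mysql "root:pass@tcp(localhost:3306)/db?parseTime=true" up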

Atlas

Atlas takes a completely different approach from Golang migrate and Goose. While both tools focus only on providing the means to run and maintain the migration directory, Atlas goes one step further and actually constructs a graph representing the different database entities from the migration directory contents, allowing for more complex scenarios and providing safety for migration operations.

Migrations in Atlas can be defined in two ways:

  • Versioned migrations are the classical style, where the migration contents are written by the developer using the database language.
  • Declarative migrations are more similar to Infrastructure-as-Code, where the schema is defined in a Terraform-like language and the migration commands are calculated based on the current and desired state of the database. It's also possible to use Atlas in a hybrid way that combines both styles, called Versioned Migration Authoring, where the schema is defined in the Atlas language but the Atlas engine is used to generate versioned migrations.

On top of Atlas's ability to load the migration directory as a graph of database entities, an entire infrastructure of static code analysis was built to provide warnings about dangerous or inefficient operations. This technique is called migration linting and can be integrated with the Atlas GitHub Action during CI.

In addition, if you would like to run your migrations using Terraform, Atlas has a Terraform provider as well.

Another key point Atlas addresses is migration integrity, which becomes a huge problem when working with multiple branches that all make schema changes. Atlas solves this problem by using an integrity file. While we are on the topic of integrity, another key feature of Atlas is support for running migrations inside a transaction, unlike Goose. During the migration process, Atlas acquires a lock, ensuring that only one migration happens at a time and that migration order/integrity is respected. For cases where problems are found, Atlas makes the troubleshooting process easier, allowing schema inspection and dry runs and providing helpful links to common problems and solutions.

Feature comparison

| Feature | Golang migrate | Goose | Atlas |
| --- | --- | --- | --- |
| Drivers supported | Main SQL and NoSQL databases | Main SQL databases | Main SQL databases |
| Migration sources | Local and remote SQL files | SQL and Go files | HCL and SQL files |
| Migrations type | Versioned | Versioned | Versioned and Declarative |
| Support for migrations in Go | No | Yes | Yes |
| Integrity checks | No | No | Yes |
| Migration out of order | No | Possible with hybrid versioning | Possible calculating the directory hash |
| Lock support | Yes | No | Yes |
| Use as CLI | Yes | Yes | Yes |
| Use as package | Yes | Yes | Partial support ¹ |
| Versioned Migration Authoring | No | No | Yes |
| Migration linting | No | No | Yes |
| GitHub Action | No | No | Yes |
| Terraform provider | No | No | Yes |
  • 1: Atlas provides a few packages related to database operations, but the use is limited to complex cases and there is no package that provides migration usage out of the box.

Wrapping up

In this post we saw the different strengths of each migration tool. We saw how Golang migrate offers a great variety of database drivers and migration sources, how Goose lets us write migrations in Go for complex migration scenarios, and how Atlas takes migrations in a completely different direction, improving the safety of migration operations and bringing in concepts from other fields.

Atlas Terraform Provider v0.4.0: HashiCorp partnerships, versioned migrations and more

· 5 min read
Tran Minh Giau
Software Engineer

Introduction

Today we are very excited to announce the release of Atlas Terraform Provider v0.4.0. This release brings some exciting new features and improvements to the provider which we will describe in this post.

In addition, this release is the first to be published under our new partnership with HashiCorp as a Technology Partner. Atlas is sometimes described as a "Terraform for Databases", so we have high hopes that this partnership will open up many opportunities to better integrate database schema management into IaC workflows.

What's new

When people first hear about integrating schema management into declarative workflows, many raise the concern that because making changes to the database is a high-risk operation, they would not trust a tool to do it automatically.

This is a valid concern, and this release contains three new features that we believe will help to address it:

  • SQL plan printing
  • Versioned migrations support
  • Migration safety verification

SQL Plan Printing

In previous versions of the provider, we displayed the plan as a textual diff showing which resources are added, removed or modified. With this version, the provider will also print the SQL statements that will be executed as part of the plan.

For example, suppose we have the following schema:

schema "market" {
charset = "utf8mb4"
collate = "utf8mb4_0900_ai_ci"
comment = "A schema comment"
}

table "users" {
schema = schema.market
column "id" {
type = int
}
column "name" {
type = varchar(255)
}
primary_key {
columns = [
column.id
]
}
}

And our Terraform module looks like this:

terraform {
  required_providers {
    atlas = {
      source  = "ariga/atlas"
      version = "0.4.0"
    }
  }
}

provider "atlas" {}

data "atlas_schema" "market" {
  src        = file("${path.module}/schema.hcl")
  dev_db_url = "mysql://root:pass@localhost:3307"
}

resource "atlas_schema" "market" {
  hcl        = data.atlas_schema.market.hcl
  url        = "mysql://root:pass@localhost:3306"
  dev_db_url = "mysql://root:pass@localhost:3307"
}

When we run terraform plan we will see the following output:

Plan: 1 to add, 0 to change, 0 to destroy.

│ Warning: Atlas Plan

│ with atlas_schema.market,
│ on main.tf line 17, in resource "atlas_schema" "market":
│ 17: resource "atlas_schema" "market" {

│ The following SQL statements will be executed:


│ -- add new schema named "market"
│ CREATE DATABASE `market`
│ -- create "users" table
│ CREATE TABLE `market`.`users` (`id` int NOT NULL, `name` varchar(255) NOT NULL, PRIMARY KEY (`id`)) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci

Versioned migrations

Atlas supports two types of workflows: Declarative and Versioned. With declarative workflows, the plan to migrate the database is generated automatically at runtime. Versioned migrations provide teams with a more controlled workflow where changes are planned, checked-in to source control and reviewed ahead of time. Until today, the Terraform provider only supported the declarative workflow. This release adds support for versioned migrations as well.

Suppose we have the following migration directory of two files:

20221101163823_create_users.sql
CREATE TABLE `users` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `age` bigint(20) NOT NULL,
  `name` varchar(255) COLLATE utf8mb4_bin NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `age` (`age`)
);

atlas.sum
h1:OlaV3+7xXEWc1uG/Ed2zICttHaS6ydHZmzI7Hpf2Fss=
20221101163823_create_users.sql h1:mZirkpXBoLLm+M73EbHo07muxclifb70fhWQFfqxjD4=

We can use the Terraform Atlas provider to apply this migration directory to a database:

terraform {
  required_providers {
    atlas = {
      source  = "ariga/atlas"
      version = "0.4.0"
    }
  }
}

provider "atlas" {}

// The `atlas_migration` data source loads the current state of the given database
// with regard to the migration directory.
data "atlas_migration" "hello" {
  dir = "migrations?format=atlas"
  url = "mysql://root:pass@localhost:3306/hello"
}

// The `atlas_migration` resource applies the migration directory to the database.
resource "atlas_migration" "hello" {
  dir     = "migrations?format=atlas"
  version = data.atlas_migration.hello.latest # Use latest to run all migrations
  url     = data.atlas_migration.hello.url
  dev_url = "mysql://root:pass@localhost:3307/test"
}

Running terraform plan will show the following output:

data.atlas_migration.hello: Reading...
data.atlas_migration.hello: Read complete after 0s [id=migrations?format=atlas]

Terraform used the selected providers to generate the following execution plan.
Resource actions are indicated with the following symbols:
+ create

Terraform will perform the following actions:

  # atlas_migration.hello will be created
  + resource "atlas_migration" "hello" {
      + dev_url = (sensitive value)
      + dir     = "migrations?format=atlas"
      + id      = (known after apply)
      + status  = (known after apply)
      + url     = (sensitive value)
      + version = "20221101163823"
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Linting

Atlas provides extensive support for linting database schemas. This release adds support for linting schemas as part of the Terraform plan, meaning you can now run terraform plan and see any linting errors in your migrations before applying the changes. This is especially useful when you are using the versioned migrations workflow.

Suppose we add the following migration:

20221101165036_change_unique.sql
ALTER TABLE users
DROP KEY age,
ADD CONSTRAINT NAME UNIQUE (`name`);

If we run terraform plan on the above schema, Terraform prints the following warning:


│ Warning: data dependent changes detected

│ with atlas_migration.hello,
│ on main.tf line 20, in resource "atlas_migration" "hello":
│ 20: resource "atlas_migration" "hello" {

│ File: 20221101165036_change_unique.sql

│ - MF101: Adding a unique index "NAME" on table "users" might fail in case column
│ "name" contains duplicate entries

Atlas detected that the migration may fail in case the column name contains duplicate entries! This is a very useful warning that can help you avoid unpredictable deployment failures. Atlas supports many more safety checks, which you can read about here.

Wrapping up

In this blog post, we discussed three new features that were added to the Terraform Atlas provider, designed to make managing your database schemas with Terraform safer and more predictable. We hope you will enjoy this release!

Have questions? Feedback? Feel free to reach out on our Discord server.

Migrate Multi-Tenant Environments With Atlas

· 8 min read
Ariel Mashraki
Building Atlas

Wikipedia defines Multi-tenancy as:

a software architecture in which a single instance of software runs on a server and serves multiple tenants.

In recent years, multitenancy has become a common topic in our industry as many organizations provide service to multiple customers using the same infrastructure. Multitenancy usually becomes an issue in software architecture because tenants often expect a decent level of isolation from one another.

In this post, I will go over different known approaches for achieving multi-tenancy and discuss the approach we took to build Ariga's cloud platform. In addition, I will demonstrate how we added built-in support for multi-tenant environments in Atlas to overcome some of the challenges we faced.

Introduction

Throughout the last few years, I have had the opportunity to implement multi-tenancy in various ways. Some of them might be familiar to you:

  1. A separate environment (deployment) per tenant, where isolation is achieved at both compute and data layers.
  2. A schema (named database) per tenant, where there is one environment for compute (e.g., a K8S cluster), but tenants are stored in different databases or schemas. Isolation is achieved at the data layer while compute resources are shared.
  3. One environment for all tenants, including the data layer. Typically, in this case, each table holds a tenant_id column that is used to filter statements by the tenant. Both data and compute layers are shared, with isolation achieved at the logical, database query level.

Each approach has pros and cons, but I want to briefly list the main reasons we chose to build our cloud platform based on the second option: schema per tenant.

  1. Management: Easily delete or back up tenants, and allow them to export their data without affecting others.
  2. Isolation: Limit credentials, connection pooling, and quotas per tenant. This way, one tenant cannot cause the database to choke and interrupt other tenants in case they share the same physical database.
  3. Security and data privacy: In case it is required, some tenants can be physically separated from others. For example, data can be stored in the tenant's AWS account, and the application can connect to it using a secure connection, like VPC peering in AWS.
  4. Code-maintenance: Most of the application code is written in a way that is unaware of multi-tenancy. In our case, there is one layer "at the top" that attaches the tenant connection to the context, and the API layer (e.g., GraphQL resolver) extracts the connection from the context to read/write data. As a result, we are not concerned that API changes will cross tenant boundaries.
  5. Migration: Schema changes can be executed first on "test tenants" and fail-fast in case of error.

The primary con of this approach was that there was no elegant way to execute migrations on multiple databases (N times) in Atlas. In the rest of the post, I'll cover how we solved this problem in Ariga and added built-in support for multi-tenancy in Atlas.

Atlas config file

Atlas provides a convenient way to describe and interact with multiple environments using project files. A project file is a file named atlas.hcl and contains one or more env blocks. For example:

atlas.hcl
env "local" {
url = "mysql://root:pass@:3306/"
migrations {
dir = "file://migrations"
}
}

env "prod" {
// ... a different env
}

Once defined, a project's environment can be worked against using the --env flag. For example:

atlas migrate apply --env local

The command above runs migrate apply against the database defined in the local environment.

Multi-Tenant environments

The Atlas configuration language provides a few capabilities adopted from Terraform to facilitate the definition of multi-tenant environments. The first is the for_each meta-argument that allows defining a single env block that is expanded to N instances, one for each tenant. For example:

atlas.hcl
variable "url" {
type = string
default = "mysql://root:pass@:3306/"
}

variable "tenants" {
type = list(string)
}

env "local" {
for_each = toset(var.tenants)
url = urlsetpath(var.url, each.value)
migration {
dir = "file://migrations"
}
}

The above configuration expects a list of tenants to be provided as a variable. This can be useful when the list of tenants is dynamic and can be injected into the Atlas command. The urlsetpath function is a helper function that sets the path of the database URL to the tenant name. For example, if url is set to mysql://root:pass@:3306/?param=value and the tenant name is tenant1, the resulting URL will be mysql://root:pass@:3306/tenant1?param=value.
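
For illustration, here is a sketch of injecting the tenant list from the command line, assuming list variables can be passed as repeated --var flags (check the project-file documentation for the exact syntax in your version):

atlas migrate apply --env local \
  --var tenants=tenant_a8m \
  --var tenants=tenant_rotemtam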

The second capability is Data Sources. This option enables users to retrieve information stored in an external service or database. For the sake of this example, let's extend the configuration above to use the SQL data source to retrieve the list of tenants from the INFORMATION_SCHEMA in MySQL:

atlas.hcl
// The URL of the database we operate on.
variable "url" {
  type    = string
  default = "mysql://root:pass@:3306/"
}

// Schemas that match this pattern will be considered tenants.
variable "pattern" {
  type    = string
  default = "tenant_%"
}

data "sql" "tenants" {
  url   = var.url
  query = <<EOS
SELECT `schema_name`
FROM `information_schema`.`schemata`
WHERE `schema_name` LIKE ?
EOS
  args = [var.pattern]
}

env "local" {
  for_each = toset(data.sql.tenants.values)
  url      = urlsetpath(var.url, each.value)
}

Example

Let's demonstrate how managing migrations in a multi-tenant architecture is made simple with Atlas.

1. Install Atlas

To download and install the latest release of the Atlas CLI, simply run the following in your terminal:

curl -sSf https://atlasgo.sh | sh

2. Create a migration directory with the following example content:

-- create "users" table
CREATE TABLE `users` (`id` int NOT NULL) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

3. Create two example tenants on a local database:

create database tenant_a8m;
create database tenant_rotemtam;

4. Run Atlas to execute the migration scripts on the tenants' databases:

atlas migrate apply --env local
tenant_a8m
Migrating to version 20220811074314 (2 migrations in total):

-- migrating version 20220811074144
-> CREATE TABLE `users` (`id` int NOT NULL) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
-- ok (36.803179ms)

-- migrating version 20220811074314
-> ALTER TABLE `users` ADD COLUMN `name` varchar(255) NOT NULL;
-- ok (26.184177ms)

-------------------------
-- 72.899146ms
-- 2 migrations
-- 2 sql statements
tenant_rotemtam
Migrating to version 20220811074314 (2 migrations in total):

-- migrating version 20220811074144
-> CREATE TABLE `users` (`id` int NOT NULL) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
-- ok (61.987153ms)

-- migrating version 20220811074314
-> ALTER TABLE `users` ADD COLUMN `name` varchar(255) NOT NULL;
-- ok (24.656515ms)

-------------------------
-- 95.233384ms
-- 2 migrations
-- 2 sql statements

Running the command again will not execute any migrations:

No migration files to execute
No migration files to execute

Migration logging

At Ariga, our services print structured logs (JSON) to feed our observability tools. That is why we felt obligated to add support for custom log formatting in Atlas. To continue the example from above, we present how we configure Atlas to emit JSON lines with the tenant name attached to them.

1. Add the log configuration to the local environment block:

atlas.hcl
env "local" {
for_each = toset(data.sql.tenants.values)
url = urlsetpath(var.url, each.value)
// Emit JSON logs to stdout and add the
// tenant name to each log line.
format {
migrate {
apply = format(
"{{ json . | json_merge %q }}",
jsonencode({
Tenant : each.value
})
)
}
}
}

2. Create a new script file in the migration directory:

-- create "users" table
CREATE TABLE `users` (`id` int NOT NULL) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

3. Run migrate apply in our "local" environment:

atlas migrate apply --env local
{"Applied":[{"Applied":["CREATE TABLE `pets` (`id` bigint, PRIMARY KEY (`id`));"],"Description":"create_pets","End":"2022-10-27T16:03:03.685899+03:00","Name":"20221027125605_create_pets.sql","Start":"2022-10-27T16:03:03.655879+03:00","Version":"20221027125605"}],"Current":"20220811074314","Dir":"migrations","Driver":"mysql","End":"2022-10-27T16:03:03.685899+03:00","Pending":[{"Description":"create_pets","Name":"20221027125605_create_pets.sql","Version":"20221027125605"}],"Start":"2022-10-27T16:03:03.647091+03:00","Target":"20221027125605","Tenant":"tenant_a8m","URL":{"ForceQuery":false,"Fragment":"","Host":":3308","OmitHost":false,"Opaque":"","Path":"/tenant_a8m","RawFragment":"","RawPath":"","RawQuery":"parseTime=true","Schema":"tenant_a8m","Scheme":"mysql","User":{}}}
{"Applied":[{"Applied":["CREATE TABLE `pets` (`id` bigint, PRIMARY KEY (`id`));"],"Description":"create_pets","End":"2022-10-27T16:03:03.787476+03:00","Name":"20221027125605_create_pets.sql","Start":"2022-10-27T16:03:03.757463+03:00","Version":"20221027125605"}],"Current":"20220811074314","Dir":"migrations","Driver":"mysql","End":"2022-10-27T16:03:03.787476+03:00","Pending":[{"Description":"create_pets","Name":"20221027125605_create_pets.sql","Version":"20221027125605"}],"Start":"2022-10-27T16:03:03.748399+03:00","Target":"20221027125605","Tenant":"tenant_rotemtam","URL":{"ForceQuery":false,"Fragment":"","Host":":3308","OmitHost":false,"Opaque":"","Path":"/tenant_rotemtam","RawFragment":"","RawPath":"","RawQuery":"parseTime=true","Schema":"tenant_rotemtam","Scheme":"mysql","User":{}}}

Next steps

Currently, Atlas uses a fail-fast policy, which means the process exits on the first tenant that returns an error. We built it this way because we find it helpful to execute migrations first on "test tenants" and stop in case the operation fails on any of them. However, this means the execution is serial and may be slow in cases where there is a large number of tenants. Therefore, we aim to add more advanced approaches that will allow executing the first M tenants serially and the rest of the N-M tenants in parallel.

Have questions? Feedback? Feel free to reach out on our Discord server.

The Atlas Migration Execution Engine

· 8 min read
Jannik Clausen
Building Atlas

With the release of v0.6.0, we introduced a workflow for managing changes to database schemas that we have called: Versioned Migration Authoring.

Today, we released the first version of the Atlas migration execution engine, which can apply migration files to your database. In this post, we will give a brief overview of the features and what to expect in the future.

Migration File Format

The Atlas migration filename format follows a very simple structure: version_[name].sql, with the name being optional. version can be an arbitrary string. Migration files are lexicographically sorted by filename.

↪ tree .
.
├── 1_initial.sql
├── 2_second.sql
├── 3_third.sql
└── atlas.sum

0 directories, 4 files

If you want to follow along, you can simply copy and paste the above files in a folder on your system. Make sure you have a database ready to work on. You can start an ephemeral docker container with the following command:

# Run a local mysql container listening on port 3306.
docker run --rm --name atlas-apply --detach --env MYSQL_ROOT_PASSWORD=pass -p 3306:3306 mysql:8

Apply Migrations

In order to apply migrations you need to have the Atlas CLI in version v0.7.0 or above. Follow the installation instructions if you don't have Atlas installed yet.

Now, to apply the first migration of our migration directory, we call atlas migrate apply and pass in some configuration parameters.

atlas migrate apply 1 \
--dir "file://migrations" \
--url "mysql://root:pass@localhost:3306/"
Migrating to version 1 (1 migrations in total):

-- migrating version 1
-> CREATE DATABASE `my_schema`;
-> CREATE TABLE `my_schema`.`tbl` (`col` int NOT NULL);
-- ok (17.247319ms)

-------------------------
-- 18.784204ms
-- 1 migrations
-- 2 sql statements

Migration Status

Atlas saves information about the database schema revisions (applied migration versions) in a special table called atlas_schema_revisions. In the example above we connected to the database without specifying which schema to operate against. For this reason, Atlas created the revision table in a new schema called atlas_schema_revisions. For a schema-bound connection Atlas will put the table into the connected schema. We will see that in a bit.
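
If you are curious, you can peek at this table directly. For example, with the container started above (table and schema names as described here; the exact column layout may differ between Atlas versions):

docker exec atlas-apply mysql -ppass -e "SELECT * FROM atlas_schema_revisions.atlas_schema_revisions;"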

Go ahead and call atlas migrate status to gather information about the database migration state:

atlas migrate status \
--dir "file://migrations" \
--url "mysql://root:pass@localhost:3306/"
Migration Status: PENDING
-- Current Version: 1
-- Next Version: 2
-- Executed Files: 1
-- Pending Files: 2

This output tells us that the last applied version is 1, the next one is called 2 and that we still have two migrations pending. Let's apply the pending migrations:

Note that we do not pass an argument to apply; in this case, Atlas will attempt to apply all pending migrations.

atlas migrate apply \
--dir "file://migrations" \
--url "mysql://root:pass@localhost:3306/"
Migrating to version 3 from 1 (2 migrations in total):

-- migrating version 2
-> ALTER TABLE `my_schema`.`tbl` ADD `col_2` TEXT;
-- ok (13.98847ms)

-- migrating version 3
-> CREATE TABLE `tbl_2` (`col` int NOT NULL);
Error 1046: No database selected

-------------------------
-- 15.604338ms
-- 1 migrations ok (1 with errors)
-- 1 sql statements ok (1 with errors)

Error: Execution had errors: Error 1046: No database selected

Error: sql/migrate: executing statement "CREATE TABLE `tbl_2` (`col` int NOT NULL);" from version "3": Error 1046: No database selected

What happened here? After further investigation, you will find that our connection URL is bound to the entire database, not to a schema. The third migration file, however, does not contain a schema qualifier for the CREATE TABLE statement.

By default, Atlas wraps the execution of each migration file into one transaction. This transaction gets rolled back if any error occurs within execution. Be aware though, that some databases, such as MySQL and MariaDB, don't support transactional DDL. If you want to learn how to configure the way Atlas uses transactions, have a look at the docs.
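
As a quick sketch of what such configuration can look like, the apply command accepts a --tx-mode flag (per the docs; the value below disables transaction wrapping and is shown only for illustration):

atlas migrate apply \
  --dir "file://migrations" \
  --url "mysql://root:pass@localhost:3306/" \
  --tx-mode "none"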

Migration Retry

To resolve this, edit the migration file and add a qualifier to the statement:

CREATE TABLE `my_schema`.`tbl_2` (`col` int NOT NULL);

Since you changed the contents of a migration file, we have to re-calculate the directory integrity hash-sum by calling:

atlas migrate hash --force \
--dir "file://migrations"

Then we can proceed and simply attempt to execute the migration file again.

atlas migrate apply \
--dir "file://migrations" \
--url "mysql://root:pass@localhost:3306/"
Migrating to version 3 from 2 (1 migrations in total):

-- migrating version 3
-> CREATE TABLE `my_schema`.`tbl_2` (`col` int NOT NULL);
-- ok (15.168892ms)

-------------------------
-- 16.741173ms
-- 1 migrations
-- 1 sql statements

Attempting to migrate again or calling atlas migrate status will tell us that all migrations have been applied to the database and there is nothing to do at the moment.

atlas migrate apply \
--dir "file://migrations" \
--url "mysql://root:pass@localhost:3306/"
No migration files to execute

Moving an existing project to Atlas with Baseline Migrations

Another common scenario is when you need to move an existing project to Atlas. To do so, create an initial migration file reflecting the current state of a database schema by using atlas migrate diff. A very simple way to do so would be by heading over to the database from before, deleting the atlas_schema_revisions schema, emptying your migration directory and running the atlas migrate diff command.

rm -rf migrations
docker exec atlas-apply mysql -ppass -e "CREATE SCHEMA \`my_schema_dev\`;" # create a dev-db
docker exec atlas-apply mysql -ppass -e "DROP SCHEMA \`atlas_schema_revisions\`;"
atlas migrate diff \
--dir "file://migrations" \
--to "mysql://root:pass@localhost:3306/my_schema" \
--dev-url "mysql://root:pass@localhost:3306/my_schema_dev"

To demonstrate that Atlas can also work on a schema level instead of a realm connection, we are running on a connection bound to the my_schema schema this time.

You should end up with the following migration directory:

-- create "tbl" table
CREATE TABLE `tbl` (`col` int NOT NULL, `col_2` text NULL) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
-- create "tbl_2" table
CREATE TABLE `tbl_2` (`col` int NOT NULL) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

Now, let's create a new migration file to create a table tbl_3 and update the directory integrity file.

atlas migrate new add_table --dir "file://migrations"
echo "CREATE TABLE `tbl_3` (`col` text NULL);" >> migrations/$(ls -t migrations | head -n1)
atlas migrate hash --force --dir "file://migrations"

Since we now have both a migration file representing our current database state and the new migration file to apply, we can make use of the --baseline flag:

atlas migrate apply \
--dir "file://migrations" \
--url "mysql://root:pass@localhost:3306/my_schema" \
--baseline "20220908110527" # replace the version with the one generated by you
Migrating to version 20220908110847 from 20220908110527 (1 migrations in total):

-- migrating version 20220908110847
-> CREATE TABLE `tbl_3` (`col` text NULL);
-- ok (14.325493ms)

-------------------------
-- 15.786455ms
-- 1 migrations
-- 1 sql statements

Outlook

The Atlas migration engine powers Ent, and the execution engine has already been used within Ariga for several months. We will continue working on improving it and releasing cool features, such as assisted troubleshooting for failed migrations, more intelligent, dialect-aware execution planning for things like MySQL's implicit commits, and more.

Wrapping up

In this post we learned about the new migration execution engine of Atlas and some information about its internals.

Further reading

To learn more about Versioned Migration Authoring:

Have questions? Feedback? Find our team on our Discord server.

Prevent destructive changes to your database with the Atlas GitHub Action

· 5 min read
Rotem Tamir
Building Atlas

Losing data is painful for almost all organizations. This is one of the reasons teams are very cautious when it comes to making changes to their databases. In fact, many teams set explicit policies on what kinds of changes to the database are allowed, often completely prohibiting any change that is destructive.

Destructive changes are changes to a database schema that result in loss of data. For instance, consider a statement such as:

ALTER TABLE `users` DROP COLUMN `email_address`;

This statement is considered destructive because whatever data is stored in the email_address column will be deleted from disk, with no way to recover it.

Suppose you were in charge of a team that decided to prohibit destructive changes. How would you go about enforcing such a policy? From our experience, most teams enforce policies relating to schema migrations in code-review: a human engineer, preferably with some expertise in operating databases, manually reviews any proposed database migration scripts and rejects them if they contain destructive changes.

Relying on a human reviewer to enforce such a policy is both expensive (it takes time and mental energy) and error-prone. Just like manual QA is slowly being replaced with automated testing, and manual code style reviews are being replaced with linters, isn't it time that we automate the process of ensuring that changes to database schemas are safe?

Announcing the Atlas GitHub Action

Today, we're happy to announce the release of the official Atlas GitHub Action which can be used to apply migration directory linting for a bunch of popular database migration tools. golang-migrate, goose, dbmate and Atlas itself are already supported, and Flyway and Liquibase are coming soon.

If you're using GitHub to manage your source code, you're in luck. By adding a short configuration file to your repository, you can start linting your schema migration scripts today! Let's see a short example.

Setting up

Suppose we are running a website for an e-commerce business. To store the data for our website we use a MySQL database. Because the data in this database is everything to us, we use a careful versioned migrations approach where each change to the database schema is described in an SQL script and stored in our Git repository. To execute these scripts we use a popular tool called golang-migrate.

The source code for this example can be found in rotemtam/atlas-action-demo.

Initially, our schema contains two tables: users and orders, documented in the first few migration files:

Create the users table:

-- create "users" table
CREATE TABLE `users` (
  `id` int NOT NULL,
  `name` varchar(100) NULL,
  PRIMARY KEY (`id`)
) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

Add a unique email column:

ALTER TABLE `users` ADD COLUMN `email` varchar(255) NOT NULL, ADD UNIQUE INDEX `email_unique` (`email`);

Create the orders table, with a foreign-key referencing the users table:

-- create "orders" table
CREATE TABLE `orders` (
  `id` int NOT NULL,
  `user_id` int NOT NULL,
  `total` decimal(10) NOT NULL,
  PRIMARY KEY (`id`),
  INDEX `user_orders` (`user_id`),
  CONSTRAINT `user_orders` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON UPDATE NO ACTION ON DELETE NO ACTION
) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

Installing the Atlas Action

To make sure we never accidentally delete data during schema changes, we enact a policy that prohibits destructive changes to the database. To enforce this policy, we invoke the atlas-action GitHub Action from within our continuous integration flow by adding a workflow file named .github/workflows/atlas-ci.yaml:

name: Atlas CI
on:
  # Run whenever code is changed in the master branch,
  # change this to your root branch.
  push:
    branches:
      - master
  # Run on PRs where something changed under the `path/to/migration/dir/` directory.
  pull_request:
    paths:
      - 'migrations/*'
jobs:
  lint:
    services:
      # Spin up a mysql:8.0.29 container to be used as the dev-database for analysis.
      mysql:
        image: mysql:8.0.29
        env:
          MYSQL_ROOT_PASSWORD: pass
          MYSQL_DATABASE: test
        ports:
          - "3306:3306"
        options: >-
          --health-cmd "mysqladmin ping -ppass"
          --health-interval 10s
          --health-start-period 10s
          --health-timeout 5s
          --health-retries 10
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3.0.1
        with:
          fetch-depth: 0 # Mandatory unless "latest" is set below.
      - uses: ariga/atlas-action@v0
        with:
          dir: migrations/
          dir-format: golang-migrate # Or: atlas, goose, dbmate
          dev-url: mysql://root:pass@localhost:3306/test

Detecting a destructive change

Next, let's see what happens when a developer accidentally proposes a destructive change to drop a column in the orders table:

-- modify "orders" table
ALTER TABLE `orders` DROP COLUMN `total`;

This change is proposed in PR #1 in our example repo. Because we have previously set up the Atlas GitHub Action to lint our migration directory, whenever a file changes under the migrations/ directory, a workflow is triggered.

After letting our workflow complete, observe that GitHub informs us that the Atlas CI / lint check has failed:

Clicking on the "details" link we find a detailed explanation on the causes for the failure:

Examining the Action run summary we find the following annotation:

As you can see, Atlas has detected the destructive change we proposed to apply to our database and failed our build!

Wrapping up

In this post we discussed why many teams set policies to prevent destructive changes to database schemas. We further showed how such policies can be enforced in an automated way using the official Atlas GitHub Action.

Further reading

To learn more about CI for database schema changes:

Have questions? Feedback? Find our team on our Discord server.

Announcing v0.6.0 with Versioned Migration Authoring

· 8 min read
Ariel Mashraki
Building Atlas

With the release of v0.6.0, we are happy to announce official support for a style of workflow for managing changes to database schemas that we have been experimenting with in the past months: Versioned Migration Authoring.

TL;DR

  • Atlas supports a declarative workflow (similar to Terraform) where users provide the desired database schema in a simple data definition language and Atlas calculates a plan to get a target database to that state. This workflow is supported by the schema apply command.
  • Many teams prefer a more imperative approach where each change to the database schema is checked-in to source control and reviewed during code-review. This type of workflow is commonly called versioned migrations (or change based migrations) and is supported by many established tools such as Flyway and Liquibase.
  • The downside of the versioned migration approach is, of course, that it puts the burden of planning the migration on developers. As part of the Atlas project we advocate for a third combined approach that we call "Versioned Migration Authoring".
  • Versioned Migration Authoring is an attempt to combine the simplicity and expressiveness of the declarative approach with the control and explicitness of versioned migrations.
  • To use Versioned Migration Authoring today, use the atlas migrate diff command. See the Getting Started section below for instructions.

Declarative Migrations

The declarative approach has become increasingly popular with engineers nowadays because it embodies a convenient separation of concerns between application and infrastructure engineers. Application engineers describe what (the desired state) they need to happen, and infrastructure engineers build tools that plan and execute ways to get to that state (how). This division of labor allows for great efficiencies as it abstracts away the complicated inner workings of infrastructure behind a simple, easy-to-understand API for application developers, and allows infrastructure engineers to specialize and develop deep expertise.

With declarative migrations, the desired state of the database schema is given as input to the migration engine, which plans and executes a set of actions to change the database to its desired state.

For example, suppose your application uses a small SQLite database to store its data. In this database, you have a users table with this structure:

schema "main" {}

table "users" {
schema = schema.main
column "id" {
type = int
}
column "greeting" {
type = text
}
}

Now, suppose that you want to add a default value of "shalom" to the greeting column. Many developers are not aware that it isn't possible to modify a column's default value in an existing table in SQLite. Instead, the common practice is to create a new table, copy the existing rows into the new table and drop the old one after. Using the declarative approach, developers can change the default value for the greeting column:

schema "main" {}

table "users" {
schema = schema.main
column "id" {
type = int
}
column "greeting" {
type = text
default = "shalom"
}
}

And have Atlas's engine devise a plan similar to this:

-- Planned Changes:
-- Create "new_users" table
CREATE TABLE `new_users` (`id` int NOT NULL, `greeting` text NOT NULL DEFAULT 'shalom')
-- Copy rows from old table "users" to new temporary table "new_users"
INSERT INTO `new_users` (`id`, `greeting`) SELECT `id`, IFNULL(`greeting`, 'shalom') AS `greeting` FROM `users`
-- Drop "users" table after copying rows
DROP TABLE `users`
-- Rename temporary table "new_users" to "users"
ALTER TABLE `new_users` RENAME TO `users`

Versioned Migrations

As the database is one of the most critical components in any system, applying changes to its schema is rightfully considered a dangerous operation. For this reason, many teams prefer a more imperative approach where each change to the database schema is checked-in to source control and reviewed during code-review. Each such change is called a "migration", as it migrates the database schema from the previous version to the next. To support this kind of requirement, many popular database schema management tools such as Flyway, Liquibase or golang-migrate support a workflow that is commonly called "versioned migrations".

In addition to the higher level of control which is provided by versioned migrations, applications are often deployed to multiple remote environments at once. These environments are not controlled (or even accessible) by the development team. In such cases, declarative migrations, which rely on a network connection to the target database and on human approval of migrations plans in real-time, are not a feasible strategy.

With versioned migrations (sometimes called "change-based migrations"), instead of describing the desired state ("what the database should look like"), developers describe the changes themselves ("how to reach the state"). Most of the time, this is done by creating a set of SQL files containing the statements needed. Each of the files is assigned a unique version and a description of the changes. Tools like the ones mentioned earlier are then able to interpret the migration files and to apply (some of) them in the correct order to transition to the desired database structure.

The benefit of the versioned migrations approach is that it is explicit: engineers know exactly what queries are going to be run against the database when the time comes to execute them. Because changes are planned ahead of time, migration authors can control precisely how to reach the desired schema. If we consider a migration as a plan to get from state A to state B, oftentimes multiple paths exist, each with a very different impact on the database. To demonstrate, consider an initial state which contains a table with two columns:

CREATE TABLE users (
  id int,
  name varchar(255)
);

Suppose our desired state is:

CREATE TABLE users (
  id int,
  user_name varchar(255)
);

There are at least two ways to get from the initial to the desired state:

  • Drop the name column and create a new user_name column.
  • Alter the name of the name column to user_name.

Depending on the context, either may be the desired outcome for the developer planning the change. With versioned migrations, engineers have the ultimate confidence of what change is going to happen, which may not be known ahead of time in a declarative approach.

Migration Authoring

The downside of the versioned migration approach is, of course, that it puts the burden of planning the migration on developers. This requires a certain level of expertise that is not always available to every engineer, as we demonstrated in our example of setting a default value in a SQLite database above.

As part of the Atlas project we advocate for a third combined approach that we call "Versioned Migration Authoring". Versioned Migration Authoring is an attempt to combine the simplicity and expressiveness of the declarative approach with the control and explicitness of versioned migrations.

With versioned migration authoring, users still declare their desired state and use the Atlas engine to plan a safe migration from the existing to the new state. However, instead of coupling planning and execution, plans are instead written into normal migration files which can be checked-in to source control, fine-tuned manually and reviewed in regular code review processes.

Getting started

Start by downloading the Atlas CLI:

To download and install the latest release of the Atlas CLI, simply run the following in your terminal:

curl -sSf https://atlasgo.sh | sh

Next, define a simple Atlas schema with one table and an empty migration directory:

schema.hcl
schema "test" {}

table "users" {
schema = schema.test
column "id" {
type = int
}
}

Let's run atlas migrate diff with the necessary parameters to generate a migration script for creating our users table:

  • --dir the URL to the migration directory, by default it is file://migrations.
  • --to the URL of the desired state, an HCL file or a database connection.
  • --dev-url a URL to a Dev Database that will be used to compute the diff.
atlas migrate diff create_users \
--dir="file://migrations" \
--to="file://schema.hcl" \
--dev-url="mysql://root:pass@:3306/test"

Observe that two files were created in the migrations directory:

By default, migration files are named with the following format {{ now }}_{{ name }}.sql. If you wish to use a different file format, use the --dir-format option.

-- create "users" table
CREATE TABLE `users` (`id` int NOT NULL) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
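
For instance, a sketch of generating the same migration in golang-migrate's file layout using the --dir-format option mentioned above (double-check the flag against the docs for your Atlas version):

atlas migrate diff create_users \
  --dir="file://migrations" \
  --dir-format="golang-migrate" \
  --to="file://schema.hcl" \
  --dev-url="mysql://root:pass@:3306/test"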

Further reading

To learn more about Versioned Migration Authoring:

Have questions? Feedback? Find our team on our Discord server.