
· 5 min read
Rotem Tamir

Losing data is painful for almost all organizations. This is one of the reasons teams are very cautious when it comes to making changes to their databases. In fact, many teams set explicit policies on what kinds of changes to the database are allowed, often completely prohibiting any change that is destructive.

Destructive changes are changes to a database schema that result in loss of data. For instance, consider a statement such as:

ALTER TABLE `users` DROP COLUMN `email_address`;

This statement is considered destructive because whatever data is stored in the email_address column will be deleted from disk, with no way to recover it.

Suppose you were in charge of a team that decided to prohibit destructive changes. How would you go about enforcing such a policy? In our experience, most teams enforce policies relating to schema migrations in code review: a human engineer, preferably with some expertise in operating databases, manually reviews any proposed database migration scripts and rejects them if they contain destructive changes.

Relying on a human reviewer to enforce such a policy is both expensive (it takes time and mental energy) and error-prone. Just like manual QA is slowly being replaced with automated testing, and manual code style reviews are being replaced with linters, isn't it time that we automate the process of ensuring that changes to database schemas are safe?

Announcing the Atlas GitHub Action

Today, we're happy to announce the release of the official Atlas GitHub Action, which can be used to lint migration directories for a number of popular database migration tools. golang-migrate, goose, dbmate, and Atlas itself are already supported, and Flyway and Liquibase are coming soon.

If you're using GitHub to manage your source code, you're in luck. By adding a short configuration file to your repository, you can start linting your schema migration scripts today! Let's see a short example.

Setting up

Suppose we are running a website for an e-commerce business. To store the data for our website we use a MySQL database. Because the data in this database is everything to us, we use a careful versioned migrations approach where each change to the database schema is described in an SQL script and stored in our Git repository. To execute these scripts we use a popular tool called golang-migrate.
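
For reference, applying these scripts with golang-migrate looks roughly like this (the database name and credentials are illustrative):

# Apply all pending migrations from the migrations/ directory against a local MySQL database.
# golang-migrate expects a tcp(host:port)-style MySQL DSN.
migrate -path ./migrations \
-database "mysql://root:pass@tcp(localhost:3306)/shop" \
up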

The source code for this example can be found in rotemtam/atlas-action-demo.

Initially, our schema contains two tables: users and orders, documented in the first few migration files:

Create the users table:

migrations/20220819060736.up.sql
-- create "users" table
CREATE TABLE `users` (
`id` int NOT NULL,
`name` varchar(100) NULL,
PRIMARY KEY (`id`)
) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

Add a unique email column:

migrations/20220819061056.up.sql
ALTER TABLE `users` ADD COLUMN `email` varchar(255) NOT NULL, ADD UNIQUE INDEX `email_unique` (`email`);

Create the orders table, with a foreign-key referencing the users table:

migrations/20220819075145.up.sql
-- create "orders" table
CREATE TABLE `orders` (
`id` int NOT NULL,
`user_id` int NOT NULL,
`total` decimal(10) NOT NULL,
PRIMARY KEY (`id`),
INDEX `user_orders` (`user_id`),
CONSTRAINT `user_orders` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON UPDATE NO ACTION ON DELETE NO ACTION
) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

Installing the Atlas Action

To make sure we never accidentally delete data during schema changes, we enact a policy that prohibits destructive changes to the database. To enforce this policy, we invoke the atlas-action GitHub Action from within our continuous integration flow by adding a workflow file named .github/workflows/atlas-ci.yaml:

.github/workflows/atlas-ci.yaml
name: Atlas CI
on:
# Run whenever code is changed in the master branch,
# change this to your root branch.
push:
branches:
- master
# Run on PRs where something changed under the `path/to/migration/dir/` directory.
pull_request:
paths:
- 'migrations/*'
jobs:
lint:
services:
# Spin up a mysql:8.0.29 container to be used as the dev-database for analysis.
mysql:
image: mysql:8.0.29
env:
MYSQL_ROOT_PASSWORD: pass
MYSQL_DATABASE: test
ports:
- "3306:3306"
options: >-
--health-cmd "mysqladmin ping -ppass"
--health-interval 10s
--health-start-period 10s
--health-timeout 5s
--health-retries 10
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3.0.1
with:
fetch-depth: 0 # Mandatory unless "latest" is set below.
- uses: ariga/atlas-action@v0
with:
dir: migrations/
dir-format: golang-migrate # Or: atlas, goose, dbmate
dev-url: mysql://root:pass@localhost:3306/test

Detecting a destructive change

Next, let's see what happens when a developer accidentally proposes a destructive change to drop a column in the orders table:

migrations/20220819081116.up.sql
-- modify "orders" table
ALTER TABLE `orders` DROP COLUMN `total`;

This change is proposed in PR #1 in our example repo. Because we have previously set up the Atlas GitHub Action to lint our migration directory, whenever a file changes under the migrations/ directory, a workflow is triggered.

After letting our workflow complete, observe that GitHub informs us that the Atlas CI / lint check has failed.

Clicking on the "details" link leads to a detailed explanation of the cause of the failure, and examining the Action run summary we find an annotation pointing at the offending statement.

Atlas has detected the destructive change we proposed to apply to our database and failed our build!

Wrapping up

In this post we discussed why many teams set policies to prevent destructive changes to database schemas. We further showed how such policies can be enforced in an automated way using the official Atlas GitHub Action.

Further reading

To learn more about CI for database schema changes, check out the documentation on the Atlas website.

Have questions? Feedback? Find our team on our Discord server.

· 8 min read
Ariel Mashraki

With the release of v0.6.0, we are happy to announce official support for a style of workflow for managing changes to database schemas that we have been experimenting with in the past months: Versioned Migration Authoring.

TL;DR

  • Atlas supports a declarative workflow (similar to Terraform) where users provide the desired database schema in a simple data definition language and Atlas calculates a plan to get a target database to that state. This workflow is supported by the schema apply command.
  • Many teams prefer a more imperative approach where each change to the database schema is checked-in to source control and reviewed during code-review. This type of workflow is commonly called versioned migrations (or change based migrations) and is supported by many established tools such as Flyway and Liquibase.
  • The downside of the versioned migration approach is, of course, that it puts the burden of planning the migration on developers. As part of the Atlas project we advocate for a third combined approach that we call "Versioned Migration Authoring".
  • Versioned Migration Authoring is an attempt to combine the simplicity and expressiveness of the declarative approach with the control and explicitness of versioned migrations.
  • To use Versioned Migration Authoring today, use the atlas migrate diff command. See the Getting Started section below for instructions.

Declarative Migrations

The declarative approach has become increasingly popular with engineers because it embodies a convenient separation of concerns between application and infrastructure engineers. Application engineers describe what they need to happen (the desired state), and infrastructure engineers build tools that plan and execute ways to get to that state (the how). This division of labor allows for great efficiencies: it abstracts away the complicated inner workings of infrastructure behind a simple, easy-to-understand API for application developers, and it lets infrastructure engineers develop specialized expertise that pays off across the organization.

With declarative migrations, the desired state of the database schema is given as input to the migration engine, which plans and executes a set of actions to change the database to its desired state.

For example, suppose your application uses a small SQLite database to store its data. In this database, you have a users table with this structure:

schema "main" {}

table "users" {
schema = schema.main
column "id" {
type = int
}
column "greeting" {
type = text
}
}

Now, suppose that you want to add a default value of "shalom" to the greeting column. Many developers are not aware that it isn't possible to modify a column's default value in an existing table in SQLite. Instead, the common practice is to create a new table, copy the existing rows into it, and drop the old one afterwards. Using the declarative approach, developers only need to change the default value for the greeting column in the schema definition:

schema "main" {}

table "users" {
schema = schema.main
column "id" {
type = int
}
column "greeting" {
type = text
default = "shalom"
}
}

And have Atlas's engine devise a plan similar to this:

-- Planned Changes:
-- Create "new_users" table
CREATE TABLE `new_users` (`id` int NOT NULL, `greeting` text NOT NULL DEFAULT 'shalom')
-- Copy rows from old table "users" to new temporary table "new_users"
INSERT INTO `new_users` (`id`, `greeting`) SELECT `id`, IFNULL(`greeting`, 'shalom') AS `greeting` FROM `users`
-- Drop "users" table after copying rows
DROP TABLE `users`
-- Rename temporary table "new_users" to "users"
ALTER TABLE `new_users` RENAME TO `users`
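
Such a plan is produced by the schema apply command mentioned above; for the SQLite example, the invocation might look roughly like this (a sketch; the database file name and URL format are assumptions):

atlas schema apply \
-u "sqlite://example.db" \
-f schema.hcl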

Versioned Migrations

As the database is one of the most critical components in any system, applying changes to its schema is rightfully considered a dangerous operation. For this reason, many teams prefer a more imperative approach where each change to the database schema is checked-in to source control and reviewed during code-review. Each such change is called a "migration", as it migrates the database schema from the previous version to the next. To support this kind of requirement, many popular database schema management tools such as Flyway, Liquibase or golang-migrate support a workflow that is commonly called "versioned migrations".

In addition to the higher level of control which is provided by versioned migrations, applications are often deployed to multiple remote environments at once. These environments are not controlled (or even accessible) by the development team. In such cases, declarative migrations, which rely on a network connection to the target database and on human approval of migrations plans in real-time, are not a feasible strategy.

With versioned migrations (sometimes called "change-based migrations"), instead of describing the desired state ("what the database should look like"), developers describe the changes themselves ("how to reach the state"). Most of the time, this is done by creating a set of SQL files containing the statements needed. Each of the files is assigned a unique version and a description of the changes. Tools like the ones mentioned earlier are then able to interpret the migration files and to apply (some of) them in the correct order to transition to the desired database structure.
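
For illustration, such a migration directory might look like this (the file names are hypothetical, and naming conventions vary between tools):

migrations/
├── 20220801000000_create_users.up.sql
├── 20220801000000_create_users.down.sql
├── 20220802000000_add_email_to_users.up.sql
└── 20220802000000_add_email_to_users.down.sql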

The benefit of the versioned migrations approach is that it is explicit: engineers know exactly what queries are going to be run against the database when the time comes to execute them. Because changes are planned ahead of time, migration authors can control precisely how to reach the desired schema. If we consider a migration as a plan to get from state A to state B, oftentimes multiple paths exist, each with a very different impact on the database. To demonstrate, consider an initial state which contains a table with two columns:

CREATE TABLE users (
id int,
name varchar(255)
);

Suppose our desired state is:

CREATE TABLE users (
id int,
user_name varchar(255)
);

There are at least two ways to get from the initial state to the desired one:

  • Drop the name column and create a new user_name column.
  • Alter the name of the name column to user_name.

Depending on the context, either may be the desired outcome for the developer planning the change. With versioned migrations, engineers know exactly which change is going to happen, something that may not be known ahead of time with a purely declarative approach.

Migration Authoring

The downside of the versioned migration approach is, of course, that it puts the burden of planning the migration on developers. This requires a certain level of expertise that is not always available to every engineer, as we demonstrated in our example of setting a default value in a SQLite database above.

As part of the Atlas project we advocate for a third combined approach that we call "Versioned Migration Authoring". Versioned Migration Authoring is an attempt to combine the simplicity and expressiveness of the declarative approach with the control and explicitness of versioned migrations.

With versioned migration authoring, users still declare their desired state and use the Atlas engine to plan a safe migration from the existing to the new state. However, instead of coupling planning and execution, plans are instead written into normal migration files which can be checked-in to source control, fine-tuned manually and reviewed in regular code review processes.

Getting started

Start by downloading the Atlas CLI. To download and install the latest release, run the following in your terminal:

curl -sSf https://atlasgo.sh | sh
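
Once installed, you can verify the setup by printing the CLI version:

atlas version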

Next, define a simple Atlas schema with one table and an empty migration directory:

schema.hcl
schema "test" {}

table "users" {
schema = schema.test
column "id" {
type = int
}
}

Let's run atlas migrate diff with the necessary parameters to generate a migration script for creating our users table:

  • --dir - the URL of the migration directory; by default it is file://migrations.
  • --to - the URL of the desired state: an HCL file or a database connection.
  • --dev-url - a URL to a Dev Database that will be used to compute the diff.
atlas migrate diff create_users \
--dir="file://migrations" \
--to="file://schema.hcl" \
--dev-url="mysql://root:pass@:3306/test"

Observe that two files were created in the migrations directory: a timestamped SQL file containing the planned migration, and an atlas.sum file that maintains the integrity of the migration directory.

By default, migration files are named with the following format {{ now }}_{{ name }}.sql. If you wish to use a different file format, use the --dir-format option. The contents of the new migration file are:

-- create "users" table
CREATE TABLE `users` (`id` int NOT NULL) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
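
If your team executes migrations with one of the other supported tools, the same plan can be authored in that tool's format. For example, generating golang-migrate flavored files might look roughly like this (a sketch; the exact values accepted by --dir-format may differ between versions):

atlas migrate diff create_users \
--dir="file://migrations" \
--dir-format="golang-migrate" \
--to="file://schema.hcl" \
--dev-url="mysql://root:pass@:3306/test"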

Further reading

To learn more about Versioned Migration Authoring, check out the documentation on the Atlas website.

Have questions? Feedback? Find our team on our Discord server.

· 4 min read
Rotem Tamir

With the release of v0.5.0, we are happy to announce a very significant milestone for the project. While this version includes some cool features (such as multi-file schemas) and a swath of incremental improvements and bugfixes, there is one feature that we're particularly excited about and want to share with you in this post.

As most outages happen directly as a result of a change to a system, Atlas provides users with the means to verify the safety of planned changes before they happen. The sqlcheck package provides interfaces for analyzing the contents of SQL files to generate insights on the safety of many kinds of changes to database schemas. With this package, developers may define an Analyzer that can be used to diagnose the impact of SQL statements on the target database.

This functionality is exposed to CLI users via the migrate lint subcommand. By utilizing the sqlcheck package, Atlas can now check your migration directory for common problems and issues.

atlas migrate lint in action

Recall that Atlas uses a dev database to plan and simulate schema changes. Let's start by spinning up a container that will serve as our dev database:

docker run --name atlas-db-dev -d -p 3307:3306 -e MYSQL_ROOT_PASSWORD=pass  mysql

Next let's create schema.hcl, the HCL file which will contain the desired state of our database:

schema.hcl
schema "example" {
}
table "users" {
schema = schema.example
column "id" {
type = int
}
column "name" {
type = varchar(255)
}
primary_key {
columns = [
column.id
]
}
}

To simplify the commands we need to type in this demo, let's create an Atlas project file to define a local environment.

atlas.hcl
env "local" {
src = "./schema.hcl"
url = "mysql://root:pass@localhost:3306"
dev = "mysql://root:pass@localhost:3307"
}

Next, let's plan the initial migration that creates the users table:

atlas migrate diff --env local

Observe that the migrations/ directory was created with an .sql file and a file named atlas.sum:

├── atlas.hcl
├── migrations
│ ├── 20220714090139.sql
│ └── atlas.sum
└── schema.hcl

This is the contents of our new migration script:

-- add new schema named "example"
CREATE DATABASE `example`;
-- create "users" table
CREATE TABLE `example`.`users` (`id` int NOT NULL, `name` varchar(255) NOT NULL, PRIMARY KEY (`id`)) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

Next, let's make a destructive change to the schema. Destructive changes are changes to a database schema that result in loss of data, such as dropping a column or table. Let's remove the name column from our desired schema:

schema.hcl
schema "example" {
}
table "users" {
schema = schema.example
column "id" {
type = int
}
// Notice the "name" column is missing.
primary_key {
columns = [
column.id
]
}
}

Now, let's plan a migration to this new schema:

atlas migrate diff --env local

Observe the new migration which Atlas planned for us:

-- modify "users" table
ALTER TABLE `example`.`users` DROP COLUMN `name`;

Finally, let's use atlas migrate lint to analyze this change and verify its safety:

atlas migrate lint --env local --latest 1

Destructive changes detected in file 20220714090811.sql:

L2: Dropping non-virtual column "name"

When we run the lint command, we need to instruct Atlas on how to decide what set of migration files to analyze. Currently, two modes are supported.

  • --git-base <branchName>: selects the diff between the provided branch and the current one as the changeset (see the example after this list).
  • --latest <n>: selects the latest n migration files as the changeset.
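
For example, to lint everything that changed relative to the master branch in CI, the invocation might look like this (a sketch; the branch name is an assumption for this example):

atlas migrate lint --env local --git-base master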

As expected, Atlas analyzed this change and detected a destructive change to our database schema. In addition, Atlas users can analyze the migration directory to automatically detect:

  • Data-dependent changes
  • Migration Directory integrity
  • Backward-incompatible changes (coming soon)
  • Drift between the desired and the migration directory (coming soon)
  • .. and more

Wrapping up

We started Atlas more than a year ago because we felt that the industry deserves a better way to manage databases. A huge amount of progress has been made as part of the DevOps movement on the fronts of managing compute, networking and configuration. So much, in fact, that it always baffled us to see that the database, the single most critical component of any software system, did not receive this level of treatment.

Until today, the task of verifying the safety of migration scripts was reserved for humans (preferably SQL-savvy and highly experienced). We believe that with this milestone we are beginning to pave a road to a reality where teams can move as quickly and safely with their databases as they can with their code.

Have questions? Feedback? Find our team on our Discord server.

· 4 min read
Amit Shani

Today, I'm happy to announce the release of v0.4.2 of the Atlas CLI. This version includes many improvements and fixes, but I wanted to share with you exciting news about something I personally worked on. As of v0.4.2, Atlas includes preview support for CockroachDB 🎉

Atlas is an open-source project that helps developers better manage their database schemas. It has a CLI tool and a Terraform integration. By using Atlas's Data Definition Language (with a syntax similar to Terraform), users can plan, verify and apply changes to their databases in a simple, declarative workflow. Earlier this year, Atlas became the migration engine for Ent, a widely popular, Linux Foundation-backed entity framework for Go.

CockroachDB is an open-source NewSQL database. From their README:

CockroachDB is a distributed SQL database built on a transactional and strongly-consistent key-value store. It scales horizontally; survives disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention; supports strongly-consistent ACID transactions; and provides a familiar SQL API for structuring, manipulating, and querying data.

CockroachDB has been gaining popularity and many of you have been asking for Atlas to support it.

While CockroachDB aims to be PostgreSQL-compatible, it still has some incompatibilities that prevented Atlas users from working with it using the existing Postgres dialect.

With the latest release of Atlas, the Postgres driver automatically detects if it is connected to a CockroachDB database and uses a custom driver which provides compatibility with CockroachDB.

Getting started with Atlas and CockroachDB

Let's see how we can use the Atlas CLI to manage the schema of a CockroachDB database. Start by downloading the latest version of Atlas. On macOS:

brew install ariga/tap/atlas

For installation instructions on other platforms, see the docs.

For the purpose of this example, let's spin up a local, single-node CockroachDB cluster in a container by running:

docker run --rm -d -p 26257:26257 --name crdb cockroachdb/cockroach start-single-node --insecure

Next, let's seed the database with a simple table:

docker exec crdb cockroach sql --insecure -e 'CREATE TABLE users (id int primary key);'

After creating the users table, use Atlas's schema inspect command to read the schema of our local database and save the result to a file:

atlas schema inspect -u 'postgres://root:pass@localhost:26257/?sslmode=disable' --schema public > schema.hcl

Observe the current HCL representation of the public schema, which contains our newly created table, users:

table "users" {
schema = schema.public
column "id" {
null = false
type = bigint
}
primary_key {
columns = [column.id]
}
}
schema "public" {
}

Next, edit schema.hcl to add a column to the users table:

schema.hcl
table "users" {
schema = schema.public
column "id" {
null = false
type = bigint
}
column "name" {
type = varchar(100)
}
primary_key {
columns = [column.id]
}
}
schema "public" {
}

Now apply the schema using the schema apply command:

atlas schema apply -u 'postgres://root:pass@localhost:26257/?sslmode=disable' --schema public -f schema.hcl

Atlas prints out the planned changes and asks for your confirmation:

-- Planned Changes:
-- Create "test" table
ALTER TABLE "public"."users" ADD COLUMN "name" character varying(100) NOT NULL
? Are you sure?:
▸ Apply
Abort

After hitting "Apply", Atlas applies the desired schema to the database:

✔ Apply

We have successfully applied our schema to our database.

To stop the container running CockroachDB run:

docker stop crdb

Learn more about Atlas

In this short example, we demonstrated two of Atlas's basic features: database inspection and declarative schema migration (applying a desired schema on a database). Here are some topics you may want to explore when getting started with Atlas:

  • Learn the DDL - learn how to define any SQL resource in Atlas's data definition language.
  • Try the Terraform Provider - see how you can use the Atlas Terraform Provider to integrate schema management in your general Infrastructure-as-Code workflows.
  • Use the migrate command to author migrations - In addition to the Terraform-like declarative workflow, Atlas can manage a migration script directory for you based on your desired schema.

Preview support

The integration of Atlas with CockroachDB is well tested with version v21.2.11 (the latest at the time of writing) and will be extended in the future. If you're using other versions of CockroachDB or looking for help, don't hesitate to file an issue or join our Discord channel.

Have questions? Feedback? Find our team on our Discord server.

· 4 min read
Rotem Tamir

A few days ago we released v0.4.1 of Atlas. Along with a multitude of improvements and fixes, I'm happy to announce the release of a feature that we've been planning for a while: Project Files.

Project files provide a way to describe and interact with multiple environments while working with Atlas. A project file is a file named atlas.hcl that contains one or more env blocks, each describing an environment. Each environment has a reference to where the schema definition file resides, a database URL and an array of the schemas in the database that are managed by Atlas:

// Define an environment named "local".
env "local" {
// Declare where the schema definition file resides.
src = "./schema/project.hcl"

// Define the URL of the database which is managed in
// this environment.
url = "mysql://localhost:3306"

// Define the URL of the Dev Database for this environment.
// See: https://atlasgo.io/dev-database
dev = "mysql://localhost:3307"

// The schemas in the database that are managed by Atlas.
schemas = ["users", "admin"]
}

env "dev" {
// ... a different env
}

Project files arose from the need to provide a better experience for developers using the CLI. For example, suppose you are using Atlas to plan migrations for your database schema. In this case, you will run a command similar to this to plan a migration:

atlas migrate diff --dev-url mysql://root:password@localhost:3306 --to file://schema.hcl --dir file://migrations --format atlas

With project files, you can define an environment named local:

env "local" {
url = "mysql://root:password@localhost:3306"
dev = "mysql://root:password@localhost:3307"
src = "./schema.hcl"
migration {
dir = "file://migrations"
}
}

Then run the migrate diff command against this environment using the --env flag:

atlas migrate diff --env local

Alternatively, suppose you want to use Atlas to apply the schema on your staging environment. Without project files, you would use:

atlas schema apply -u mysql://root:password@db.ariga.dev:3306 --dev-url mysql://root:password@localhost:3307 -f schema.hcl

To do the same using a project file, define another env named staging:

env "staging" {
url = "mysql://root:password@db.ariga.dev:3306"
dev = "mysql://root:password@localhost:3307"
src = "./schema.hcl"
}

Then run:

atlas schema apply --env staging

Passing credentials as input values

Similar to schema definition files, project files also support Input Variables. This means that we can define variable blocks in the project file to declare which values should be provided when the file is evaluated. This mechanism can (and should) be used to avoid committing database credentials to source control. To do this, first define a variable named db_password:

variable "db_password" {
type = string
}

Next, replace the database password in all connection strings with a reference to this variable, for example:

env "staging" {
url = "mysql://root:${var.db_password}@db.ariga.dev:3306"
dev = "mysql://root:${var.db_password}@localhost:3307"
src = "./schema.hcl"
}

If we run schema apply without providing the password input variable, we will receive an error message:

Error: missing value for required variable "db_password"

To provide the input variable run:

atlas schema apply --env staging --var db_password=pass

Input variables can be used for many other use cases by passing them as input values to schema files.

What's next

In this post, I presented Project Files, a new feature recently added to Atlas to help developers create more fluent workflows for managing changes to their database schemas. In the coming weeks we will be adding a few more improvements to the dev flow, such as support for marking a specific environment as the default one (alleviating the need to specify --env in many cases) and multi-file schema definitions.

Have questions? Feedback? Find our team on our Discord server.

· 5 min read
Amit Shani

Today we are glad to announce the release of the official Atlas Terraform Provider.

What is Terraform

Terraform is a popular open-source tool created by HashiCorp, used to greatly simplify the task of provisioning and managing resources in the cloud. With Terraform, organizations can describe the desired state of their infrastructure in a simple configuration language and let Terraform plan and apply these changes in an automated way. This way, Terraform allows teams to truly deliver infrastructure-as-code (IaC), which has completely changed how teams and organizations manage their cloud infrastructure.

Infrastructure-as-Code and database management

Most cloud-native applications are backed by a database. The database is often the most critical part of many software systems, so making changes to its schema (the structure and layout of the data inside) is a very risky business. However, schemas must evolve: as functionality changes over time, backing tables are added, columns are dropped, indexes are created for performance reasons, and more.

Therefore it is surprising that there is no established way of integrating the management of schema changes (commonly called schema "migrations") into popular Infrastructure-as-Code workflows. For this reason, many organizations are running migrations from within the application code or using solutions outside the ecosystem of Terraform, meaning that management of the production environment is fragmented and hard to synchronize. Atlas aims to change that.

The Atlas Terraform provider allows you to synchronize your database with your desired schema, in a safe and stateful manner. By using Atlas’s core migration engine and embedding it in a Terraform provider, we are enabling teams to manage their database schemas as part of their full IaC workflow. This way, teams can use existing providers (such as AWS or GCP) to provision the database instance and use the Atlas provider to keep the schema in sync. Integrating Atlas with Terraform is especially useful because it couples the state of the infrastructure with the state of the database. It also pairs nicely with the dev database feature, which combines infrastructure and DB management to provide safety and correctness.

Demo

Prerequisites

Make sure you have Docker and Terraform installed.

Let’s see an example of the provider in action. First, spin up a database using Docker:

docker run -p 3306:3306 --name iloveatlas -e MYSQL_ROOT_PASSWORD=pass -e MYSQL_DATABASE=market -d mysql:8

Great! Now we have an instance of MySQL running.

As an extra measure of safety, we will run another identical database that will serve as a Dev Database. In short, the dev-db helps catch errors that can only be detected when applying the schema; it is also used to normalize the schema in a correct and predictable way. Read more about it here. Run a second instance of MySQL on another port to serve as a dev-db:

docker run -p 3307:3306 --name devdb-greatness -e MYSQL_ROOT_PASSWORD=pass -e MYSQL_DATABASE=market -d mysql:8

Next, we need an HCL file describing the desired state of our database. You can use the Atlas CLI to inspect the state of an existing database, or you can use the following basic schema:

schema.hcl
table "orders" {
schema = schema.market
column "id" {
null = false
type = int
auto_increment = true
}
column "name" {
null = false
type = varchar(20)
}
primary_key {
columns = [column.id]
}
}

schema "market" {
charset = "utf8mb4"
collate = "utf8mb4_0900_ai_ci"
}

Save the schema file locally in a file named schema.hcl.

Now that we have our database schema, we can use Terraform to apply it to our database. Create a file named main.tf and copy the following snippet:

main.tf
terraform {
required_providers {
atlas = {
version = "~> 0.4.0"
source = "ariga/atlas"
}
}
}

provider "atlas" {}

data "atlas_schema" "market" {
dev_db_url = "mysql://root:pass@localhost:3307/market"
src = file("${path.module}/schema.hcl")
}

resource "atlas_schema" "market" {
hcl = data.atlas_schema.market.hcl
url = "mysql://root:pass@localhost:3306/market"
dev_db_url = "mysql://root:pass@localhost:3307/market"
}

Finally, initialize Terraform:

terraform init

And apply the schema to the database by executing:

terraform apply --auto-approve

Awesome! Now your database should have a table named orders. To verify this, connect to the database and inspect it:

$ docker exec -it iloveatlas mysql -ppass --database=market

mysql> show tables;
+------------------+
| Tables_in_market |
+------------------+
| orders |
+------------------+
1 row in set (0.00 sec)

mysql> show create table orders;
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| orders | CREATE TABLE `orders` (
`id` int NOT NULL AUTO_INCREMENT,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

For more examples and documentation visit the official GitHub repository or the provider page on Terraform registry.

What's next

In this post, we presented the Atlas Terraform Provider. The provider currently supports the basic, declarative migration workflow that is available in the Atlas engine. In upcoming versions, we will add support for an additional kind of workflow that is supported by the engine and is called versioned migration authoring. In addition, more advanced safety checks (such as simulation on database snapshots) and migration strategies are also being worked on.

While the Terraform provider has just been released, the core engine that it drives is well tested and widely used (especially as the migration engine backing the popular Ent framework). If you, like me, have always wanted to manage your database schema as part of your team's infrastructure-as-code workflow, give the Atlas Terraform provider a try!

Have questions? Feedback? Find our team on our Discord server.

· 10 min read

Database inspection is the process of connecting to a database to extract metadata about the way data is structured inside it. In this post, we will present some use cases for inspecting a database, demonstrate why it is a non-trivial problem to solve, and finally show how it can be solved using Atlas, an open-source package (and command-line tool) written in Go that we are maintaining at Ariga.

As an infrastructure engineer, I have wished many times to have a simple way to programmatically inspect a database. Database schema inspection can be useful for many purposes. For instance, you might use it to create visualizations of data topologies, or use it to find table columns that are no longer in use and can be deprecated. Perhaps you would like to automatically generate resources from this schema (such as documentation or GraphQL schemas), or to use it to locate fields that might carry personally-identifiable information for compliance purposes. Whatever your use case may be, having a robust way to get the schema of your database is the foundation for many kinds of infrastructure applications.

When we started working on the core engine for Atlas, we quickly discovered that there wasn't any established tool or package that could parse the information schema of popular databases and return a data structure representing it. Why is this the case? After all, most databases provide some command-line tool to perform inspection. For example, psql, the standard CLI for Postgres, supports the \d command to describe a table:

postgres=# \d users;
                      Table "public.users"
 Column |          Type          | Collation | Nullable | Default
--------+------------------------+-----------+----------+---------
 id     | integer                |           | not null |
 name   | character varying(255) |           |          |
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)

So what makes inspection a non-trivial problem to solve? In this post, I will discuss two aspects that I think are interesting. The first is the variance in how databases expose schema metadata and the second is the complexity of the data model that is required to represent a database schema.

How databases expose schema metadata

Most of the SQL that we use in day-to-day applications is pretty standard. However, when it comes to exposing schema metadata, database engines vary greatly in the way they work. The way to retrieve information about things like available schemas and tables, column types and their default values and many other aspects of the database schema looks completely different in each database engine. For instance, consider this query (source) which can be used to get the metadata about table columns from a Postgres database:

SELECT t1.table_name,
t1.column_name,
t1.data_type,
t1.is_nullable,
t1.column_default,
t1.character_maximum_length,
t1.numeric_precision,
t1.datetime_precision,
t1.numeric_scale,
t1.character_set_name,
t1.collation_name,
t1.udt_name,
t1.is_identity,
t1.identity_start,
t1.identity_increment,
t1.identity_generation,
col_description(to_regclass("table_schema" || '.' || "table_name")::oid, "ordinal_position") AS comment,
t2.typtype,
t2.oid
FROM "information_schema"."columns" AS t1
LEFT JOIN pg_catalog.pg_type AS t2
ON t1.udt_name = t2.typname
WHERE table_schema = $1
AND table_name IN (%s)
ORDER BY t1.table_name, t1.ordinal_position

As you can see, while it's definitely possible to get the needed metadata, information about the schema is stored in multiple tables in a way that isn't particularly well documented, and often requires delving into the actual source code to understand fully. Here's a query to get similar information from MySQL (source):

SELECT `TABLE_NAME`,
`COLUMN_NAME`,
`COLUMN_TYPE`,
`COLUMN_COMMENT`,
`IS_NULLABLE`,
`COLUMN_KEY`,
`COLUMN_DEFAULT`,
`EXTRA`,
`CHARACTER_SET_NAME`,
`COLLATION_NAME`
FROM `INFORMATION_SCHEMA`.`COLUMNS`
WHERE `TABLE_SCHEMA` = ?
AND `TABLE_NAME` IN (%s)
ORDER BY `ORDINAL_POSITION`

While this query is much shorter, you can see that it is completely different from the one we ran to inspect Postgres column metadata. This demonstrates just one way in which inspecting Postgres differs from inspecting MySQL.

Mapping database schemas into a useful data structure

To be a solid foundation for building infrastructure, inspection must produce a useful data structure that can be traversed and analyzed to provide insights; in other words, a graph representing the data topology. As mentioned above, such graphs can be used to create ERD (entity-relation diagram) charts, such as the schema visualizations in the Atlas Management UI.


Let's consider some aspects of database schemas that such a data structure should capture:

  • Databases are split into logical schemas.
  • Schemas contain tables, and may have attributes (such as default collation).
  • Tables contain columns, indexes and constraints.
  • Columns are complex entities that have types, which may be standard to the database engine (and version) or custom data types defined by the user. In addition, columns may have attributes, such as default values, that may be a literal or an expression (it is important to be able to discern between now() and "now()").
  • Indexes contain references to columns of the table they are defined on.
  • Foreign keys contain references to columns in other tables, which may reside in other schemas.
  • ...and much, much more!

Capturing any one of these aspects boils down to figuring out the correct query for the specific database engine you are working with. Providing developers with a data structure that captures all of it, and doing so well across different versions of multiple database engines, is, we've learned, not an easy task. This is a perfect opportunity for an infrastructure project: a problem that is annoyingly complex to solve and that, if solved well, becomes a foundation for many kinds of applications. This was one of our motivations for creating Atlas (GitHub) - an open-source project that we maintain here at Ariga.

Using Atlas, database schemas can be inspected to produce Go structs representing a graph of the database schema topology. Notice the many cyclic references that make it hard to print (but very ergonomic to traverse :-)):

&schema.Realm{
Schemas: {
&schema.Schema{
Name: "test",
Tables: {
&schema.Table{
Name: "users",
Schema: &schema.Schema{(CYCLIC REFERENCE)},
Columns: {
&schema.Column{
Name: "id",
Type: &schema.ColumnType{
Type: &schema.IntegerType{
T: "int",
Unsigned: false,
},
Null: false,
},
},
},
PrimaryKey: &schema.Index{
Unique: false,
Table: &schema.Table{(CYCLIC REFERENCE)},
Attrs: nil,
Parts: {
&schema.IndexPart{
SeqNo: 0,
Desc: false,
C: &schema.Column{(CYCLIC REFERENCE)},
},
},
},
},
&schema.Table{
Name: "posts",
Schema: &schema.Schema{(CYCLIC REFERENCE)},
Columns: {
&schema.Column{
Name: "id",
Type: &schema.ColumnType{
Type: &schema.IntegerType{
T: "int",
Unsigned: false,
},
Null: false,
},
},
&schema.Column{
Name: "author_id",
Type: &schema.ColumnType{
Type: &schema.IntegerType{
T: "int",
Unsigned: false,
},
Null: true,
},
},
},
PrimaryKey: &schema.Index{
Unique: false,
Table: &schema.Table{(CYCLIC REFERENCE)},
Parts: {
&schema.IndexPart{
SeqNo: 0,
Desc: false,
C: &schema.Column{(CYCLIC REFERENCE)},
},
},
},
ForeignKeys: {
&schema.ForeignKey{
Symbol: "owner_id",
Table: &schema.Table{(CYCLIC REFERENCE)},
Columns: {
&schema.Column{(CYCLIC REFERENCE)},
},
RefTable: &schema.Table{(CYCLIC REFERENCE)},
RefColumns: {
&schema.Column{(CYCLIC REFERENCE)},
},
OnDelete: "SET NULL",
},
},
},
},
},
},
}

Inspecting databases in Go using Atlas

While Atlas is commonly used as a CLI tool, all of Atlas's core-engine capabilities are available as a Go module that you can use programmatically. Let's get started with database inspection in Go:

To install Atlas, use:

go get ariga.io/atlas@master

Drivers

Atlas currently supports three core capabilities for working with SQL schemas.

  • "Inspection" - Connecting to a database and understanding its schema.
  • "Plan" - Compares two schemas and produces a set of changes needed to reconcile the target schema to the source schema.
  • "Apply" - creates concrete set of SQL queries to migrate the target database.

In this post we will dive into inspection with Atlas. The way inspection is done varies greatly between the different SQL databases. Atlas currently has four supported drivers:

  • MySQL
  • MariaDB
  • PostgreSQL
  • SQLite

Atlas drivers are built on top of the standard library database/sql package. To initialize the different drivers, we need to initialize a sql.DB and pass it to the Atlas driver constructor. For example:

package tutorial

import (
"database/sql"
"log"
"testing"
_ "github.com/mattn/go-sqlite3"
"ariga.io/atlas/sql/schema"
"ariga.io/atlas/sql/sqlite"
)

func Test(t *testing.T) {
// Open a "connection" to sqlite.
db, err := sql.Open("sqlite3", "file:example.db?cache=shared&_fk=1&mode=memory")
if err != nil {
log.Fatalf("failed opening db: %s", err)
}
// Open an atlas driver.
driver, err := sqlite.Open(db)
if err != nil {
log.Fatalf("failed opening atlas driver: %s", err)
}
// ... do stuff with the driver
}

Inspection

As we mentioned above, inspection is one of Atlas's core capabilities. Consider the Inspector interface in the sql/schema package:

package schema

// Inspector is the interface implemented by the different database
// drivers for inspecting multiple tables.
type Inspector interface {
// InspectSchema returns the schema description by its name. An empty name means the
// "attached schema" (e.g. SCHEMA() in MySQL or CURRENT_SCHEMA() in PostgreSQL).
// A NotExistError error is returned if the schema does not exists in the database.
InspectSchema(ctx context.Context, name string, opts *InspectOptions) (*Schema, error)

// InspectRealm returns the description of the connected database.
InspectRealm(ctx context.Context, opts *InspectRealmOption) (*Realm, error)
}

As you can see, the Inspector interface provides methods for inspecting on different levels:

  • InspectSchema - provides inspection capabilities for a single schema within a database server.
  • InspectRealm - inspects the entire connected database server.

Each database driver (for example MySQL, Postgres or SQLite) implements this interface. Let's see how we can use this interface by inspecting a "dummy" SQLite database. Continuing from the example above:

package tutorial

func TestInspect(t *testing.T) {
// ... skipping driver creation (the elided setup opens db just like in the previous example)
ctx := context.Background()
// Create an "example" table for Atlas to inspect.
_, err = db.ExecContext(ctx, "create table example ( id int not null );")
if err != nil {
log.Fatalf("failed creating example table: %s", err)
}
// Open an atlas driver.
driver, err := sqlite.Open(db)
if err != nil {
log.Fatalf("failed opening atlas driver: %s", err)
}
// Inspect the created table.
sch, err := driver.InspectSchema(ctx, "main", &schema.InspectOptions{
Tables: []string{"example"},
})
if err != nil {
log.Fatalf("failed inspecting schema: %s", err)
}
tbl, ok := sch.Table("example")
require.True(t, ok, "expected to find example table")
require.EqualValues(t, "example", tbl.Name)
id, ok := tbl.Column("id")
require.True(t, ok, "expected to find id column")
require.EqualValues(t, &schema.ColumnType{
Type: &schema.IntegerType{T: "int"}, // An integer type, specifically "int".
Null: false, // The column has NOT NULL set.
Raw: "INT", // The raw type inspected from the DB.
}, id.Type)
}

The full source code for this example is available in the atlas-examples repo.

And voila! In this example, we first created a table named "example" by executing a query directly against the database. Next, we used the driver's InspectSchema method to inspect the schema of the table we created. Finally, we made some assertions on the returned schema.Table instance to verify that it was inspected correctly.

Inspecting using the CLI

If you don't want to write any code and just want to get a document representing your database schema, you can always use the Atlas CLI to do it for you. To get started, head over to the docs.
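
For example, inspecting a local MySQL database and saving the result to an HCL file looks roughly like this (the connection string is illustrative):

atlas schema inspect -u "mysql://root:pass@localhost:3306/example" > schema.hcl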

Wrapping up

In this post we presented the Go API of Atlas, which we initially built around our use case of building a new database migration tool, as part of the Operational Data Graph Platform that we are creating here at Ariga. As we mentioned in the beginning of this post, there are a lot of cool things you can build if you have proper database inspection, which raises the question, what will you build with it?

Getting involved with Atlas

To get involved with Atlas, check out the project on GitHub and join the conversation on our Discord server.

· 9 min read

Last week we released v0.3.2 of the Atlas CLI.

Atlas is an open source tool that helps developers manage their database schemas. Atlas plans database migrations for you based on your desired state. Its two main commands are inspect and apply: the inspect command inspects your database, and the apply command runs a migration to the desired state that you provide as an HCL document.

The most notable change in this version is the ability to interact with multiple schemas in both database inspection and migration (the apply command).

Some other interesting features (with example invocations shown after the list) include:

  • schema apply --dry-run - running schema apply in dry-run mode connects to the target database and prints the SQL migration to bring the target database to the desired state without prompting the user to approve it.
  • schema fmt - adds basic formatting capabilities to .hcl files.
  • schema diff - Connects to two given databases, inspects them, calculates the difference in their schemas, and prints a plan of SQL statements needed to migrate the "from" database to the state of the "to" database.
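
For reference, these can be invoked roughly as follows (illustrative sketches; consult the CLI reference for the exact flags in your version):

# Print the SQL required to reach the desired state, without applying it.
atlas schema apply -u "mysql://root:pass@localhost:3306/example" -f atlas.hcl --dry-run

# Format an HCL schema file.
atlas schema fmt atlas.hcl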

In this post we will explore the topic of multi-schema support. We will start our discussion with a brief explanation of database schemas; next, we'll present the difference between how MySQL and PostgreSQL treat "schemas". We will then show how the existing schema inspect and schema apply commands work with multi-schema support, and wrap up with some plans for future releases.

What is a database schema?

Within the context of relational (SQL) databases, a database schema is a logical unit within a physical database instance (server/cluster) that forms a namespace of sorts. Inside each schema you can describe the structure of the tables, relations, indexes and other attributes that belong to it. In other words, the database schema is a "blueprint" of the data structure inside a logical container (Note: in Oracle databases a schema is linked to the user, so it carries a different meaning which is out of scope for this post). As you can guess from the title of this post, many popular relational databases allow users to host multiple (logical) schemas on the same (physical) database.

Where are database schemas used in practice?

Why is this level of logical division necessary? Isn't it enough to be able to physically split data into different database instances? In my career, I've seen multiple scenarios in which organizations opt to split a database into multiple schemas.

First, grouping different parts of your application into logical units makes it simpler to reason about and govern. For instance, it is possible to create multiple user accounts in our database and give each of them permission to access a subset of the schemas in the database. This way, each user can only touch the parts of the database they need, preventing the practice of creating an almighty super-user account that has no permission boundary.

An additional pattern I've seen used is in applications with a multi-tenant architecture, where each tenant has its own schema with the exact same table structure (or a slightly different one, if tenants use different versions of the application). This pattern creates a stronger boundary between the different tenants (customers), preventing the scenario where one tenant accidentally gains access to another's data that happens to be hosted on the same machine.

Another useful feature of schemas is the ability to divide the same server into different environments for different development states. For example, you can have a "dev" and "staging" schema inside the same server.

What are the differences between schemas in MySQL and PostgreSQL?

A common source of confusion for developers (especially when switching teams or companies) is the difference between the meaning of schemas in MySQL and PostgreSQL. Both are currently supported by Atlas, and have some differences that should be clarified.

Looking at the MySQL glossary, it states:

"In MySQL, physically, a schema is synonymous with a database. You can substitute the keyword SCHEMA instead of DATABASE in MySQL SQL syntax, for example using CREATE SCHEMA instead of CREATE DATABASE"

As we can see, MySQL doesn't distinguish between schemas and databases in the terminology, but the underlying meaning is still the same - a logical boundary for resources and permissions.

To demonstrate this, open your favorite MySQL shell and run:

mysql> create schema atlas;
Query OK, 1 row affected (0.00 sec)

To create a table in our new schema, assuming we have the required permissions, we can switch to the context of the schema that we just created, and create a table:

USE atlas;
CREATE table some_name (
id int not null
);

Alternatively, we can prefix the schema, by running:

CREATE TABLE atlas.cli_versions
(
id bigint auto_increment primary key,
version varchar(255) not null
);

This prefix is important since, as we said, schemas are logical boundaries (unlike database servers). Therefore, we can create references between them using foreign keys from tables in SchemaA to SchemaB. Let's demonstrate this by creating another schema with a table and connecting it to a table in the atlas schema:

CREATE SCHEMA atlantis;

CREATE TABLE atlantis.ui_versions
(
id bigint auto_increment
primary key,
version varchar(255) not null,
atlas_version_id bigint null,
constraint ui_versions_atlas_version_uindex
unique (atlas_version_id)
);

Now let's link atlantis to atlas:

alter table atlantis.ui_versions
add constraint ui_versions_cli_versions_id_fk
foreign key (atlas_version_id) references atlas.cli_versions (id)
on delete cascade;

That's it! We've created 2 tables in 2 different schemas with a reference between them.

How does PostgreSQL treat schemas?

When booting a fresh PostgreSQL server, we get a default logical schema called "public". If you wish to split your database into logical units as we've shown with MySQL, you can create a new schema:

CREATE SCHEMA atlas;

Contrary to MySQL, Postgres provides an additional level of abstraction: databases. In Postgres, a single physical server can host multiple databases. Unlike schemas (which are basically the same as in MySQL), you can't reference a table in one PostgreSQL database from another.

In Postgres, the following statement will create an entirely new database, in which we can place different schemas and tables that may contain references between them:

create database releases;

When we run this statement, the database will be created with the default Postgres metadata tables and the default public schema.

In Postgres, you can grant permissions on entire databases, schemas, and/or tables, as well as on other objects in the Postgres schema.

Another distinction from MySQL is that in addition to sufficient permissions, a user must have the schema name inside their search_path in order to use it without a prefix.

To sum up, both MySQL and Postgres allow the creation of separate logical schemas within a physical database server, and schemas can refer to one another via foreign keys. PostgreSQL supports an additional level of separation by allowing users to create completely different databases on the same server.

Atlas multi-schema support

As we have shown, having multiple schemas in the same database is a common scenario with popular relational databases. Previously, the Atlas CLI only supported inspecting or applying changes to a single schema (even though this has long been supported in the Go API). With this release, we have added support for inspecting and applying multiple schemas with a single .hcl file.

Next, let's demonstrate how we can use the Atlas CLI to inspect and manage a database with multiple schemas.

Start by downloading and installing the latest version of the CLI. For the purpose of this demo, we will start with a fresh MySQL database running in a local Docker container:

docker run --name atlas-db  -p 3306:3306 -e MYSQL_ROOT_PASSWORD=pass -e MYSQL_DATABASE=example mysql:8

By passing example in the MYSQL_DATABASE environment variable, a new schema named "example" is created. Let's verify this using the atlas schema inspect command. In previous versions of Atlas, users had to specify the schema name as part of the DSN for connecting to the database, for example:

atlas schema inspect -u "mysql://root:pass@localhost:3306/example"

Starting with v0.3.2, users can omit the schema name from the DSN to instruct Atlas to inspect the entire database. Let's try this:

$ atlas schema inspect -u "mysql://root:pass@localhost:3306/" > atlas.hcl
$ cat atlas.hcl
schema "example" {
charset = "utf8mb4"
collation = "utf8mb4_0900_ai_ci"
}

Let's verify that this works correctly by editing the atlas.hcl that we have created above and adding a new schema:

schema "example" {
charset = "utf8mb4"
collation = "utf8mb4_0900_ai_ci"
}
schema "example_2" {
charset = "utf8mb4"
collation = "utf8mb4_0900_ai_ci"
}

Next, we will use the schema apply command to apply our changes to the database:

atlas schema apply -u "mysql://root:pass@localhost:3306/" -f atlas.hcl

Atlas plans a migration to add the new DATABASE (recall that in MySQL DATABASE and SCHEMA are synonymous) to the server. When prompted to approve the migration, we choose "Apply":

-- Planned Changes:
-- Add new schema named "example_2"
CREATE DATABASE `example_2`
✔ Apply

To verify that schema inspect works properly with multiple schemas, let's re-run:

atlas schema inspect -u "mysql://root:pass@localhost:3306/"

Observe that both schemas are inspected:

schema "example" {
charset = "utf8mb4"
collation = "utf8mb4_0900_ai_ci"
}
schema "example_2" {
charset = "utf8mb4"
collation = "utf8mb4_0900_ai_ci"
}

To learn more about the different options for working with multiple schemas in inspect and apply commands, consult the CLI Reference Docs.

What's next for multi-schema support?

I hope you agree that multi-schema support is a great improvement to the Atlas CLI, but there is more to come in this area. In our previous blog post, we shared that Atlas also has a Management UI (the -w option in the CLI), and multi-schema support is not available there yet - stay tuned for updates on multi-schema support for the UI in an upcoming release!

Getting involved with Atlas

· 5 min read

Earlier this week we released v0.3.0 of the Atlas CLI. This version features a ton of improvements to database inspection, diffing, and migration planning. You can read about those on the release notes page, but we wanted to take the time to introduce the biggest feature in this release: the Atlas Management UI.

To recap, Atlas is an open-source CLI tool that helps developers manage their database schemas. In contrast to existing tools, Atlas intelligently plans schema migrations for you based on your desired state. Atlas currently has two main commands: inspect and apply. The inspect command inspects your database and generates an Atlas HCL document. The apply command migrates your schema from its current state in the database to your desired state, described in an HCL file.
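As a quick refresher, a typical CLI round-trip looks like this (the connection URL matches the local demo database used later in this post):

atlas schema inspect -u "mysql://root:pass@localhost:3306/example" > atlas.hcl
# edit atlas.hcl to describe the desired state, then:
atlas schema apply -u "mysql://root:pass@localhost:3306/example" -f atlas.hcl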

In this post we will showcase the latest addition to the CLI's feature set, the Management UI. Until now, you could use Atlas to manage your schemas via your terminal. While this is the common interface for many infrastructure management workflows, we believe that a visual, integrated environment can be beneficial in many use cases.

Inspecting our database using the Atlas UI

Let's see how we can use the Atlas UI to inspect our database.

For the purpose of demonstration, let's assume that you have a locally running MySQL database. If you want to follow along, check out the Setting Up tutorial on the Atlas website for instructions on starting up a MySQL database locally using Docker.

We will be working with a MySQL database that has the following tables:

CREATE TABLE users (
  id int PRIMARY KEY,
  name varchar(100)
);
CREATE TABLE blog_posts (
  id int PRIMARY KEY,
  title varchar(100),
  body text,
  author_id int,
  FOREIGN KEY (author_id) REFERENCES users(id)
);
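If you're following along locally, one way to create these tables is to save the statements above to a file (schema.sql here is an arbitrary name) and pipe it into the MySQL client inside the container, assuming the atlas-db container name used in the other examples in this document:

docker exec -i atlas-db mysql --password='pass' example < schema.sql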

To inspect the database, we can use the atlas schema inspect command. Starting with this version, we can add the -w flag to open the (local) web UI:

atlas schema inspect -u "mysql://root:pass@localhost:3306/example" -w

Our browser will open automatically, and we should see this output in the CLI:

Atlas UI available at: http://127.0.0.1:5800/projects/25769803777/schemas/1
Press Ctrl+C to stop

inspect_image

We can see that our schema has been inspected, and that it's currently synced. On the bottom-left part of the screen the UI displays an ERD (Entity-Relationship Diagram) showing the different tables and the connections between them (via foreign keys). On the bottom-right, we can see the current schema, described using the Atlas DDL. In addition, on the top-right, we see the "Activity & History" panel that holds an audit history for all changes to our schema.

Migrating our database schema with the Atlas Management UI

Visualizing the current schema of the database is great, let's now see how we can use the UI to initiate a change (migration) to our schema.

Click on the Edit Schema button in the top-right corner and add the following two tables to our schema:

table "categories" {
schema = schema.example
column "id" {
null = false
type = int
}
column "name" {
null = true
type = varchar(100)
}
primary_key {
columns = [table.categories.column.id, ]
}
}

table "post_categories" {
schema = schema.example
column "post_id" {
type = int
}
column "category_id" {
type = int
}
foreign_key "post_category_post" {
columns = [table.post_categories.column.post_id, ]
ref_columns = [table.blog_posts.column.id, ]
}
foreign_key "post_category_category" {
columns = [table.post_categories.column.category_id, ]
ref_columns = [table.categories.column.id, ]
}
}

Click the Save button and go back to the schema page. Observe that a few things changed on the screen:

The UI after saving

First, we can see that the UI states that our schema is "Out of Sync". This is because there is a difference between our desired schema, the one we are currently working on, and the inspected schema, which is the actual, current schema of our database.

Second, we can see that our ERD has changed to reflect the addition of the categories and post_categories tables to our schema. The newly added tables are shown in green. By clicking the "expand" icon in the top-right corner of the ERD panel, we can open a more detailed view of our schema.

ERD displaying diff

Going back to our schema page, click the "Migrate Schema" button to initiate a migration that applies the changes we want to make to our schema. Next, Atlas will set up the migration. Click "Plan Migration" to see the migration plan for getting to the desired schema:

Migration Prep

Atlas displays the schema diff in HCL on the left pane, and the planned SQL statements on the right. Click "Apply Migration" to begin executing the plan.

Migration Plan

In the final screen of the migration flow, Atlas displays informative logs about the migration process. In this case, our migration completed successfully! Let's click "Done" to return to the schema detail page.

Applying Migration

As expected, after executing our migration plan, our database and desired schema are now synced!

Post Migrations
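If you want to double-check from the terminal, listing the tables should now include the two new ones (assuming the same local Docker setup used elsewhere in this document, with a container named atlas-db and an example database):

docker exec atlas-db mysql --password='pass' example -e "SHOW TABLES;"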

Wrapping Up

In this post, we've introduced the Atlas Management UI and showed one of the workflows it supports. There's much more inside, and we invite you to install it today and give it a try.

What next?

· 7 min read

At Ariga, we are building a new kind of platform that we call an Operational Data Graph. This platform enables software engineers to manage, maintain and access complex data architectures as if they were one database. Today, we are open-sourcing a CLI for Atlas, one of the fundamental building blocks of our platform.

During my career, the scope of what is expected of me as a software engineer has increased significantly. Developers are no longer expected just to write code; we are also expected to provision infrastructure, manage databases, define deployments and monitor systems in production.

Nowadays, one of the responsibilities we have as software engineers is to manage the database schema of our applications. Once seen as falling strictly within the domain of DBAs, this work is now handled by developers everywhere, who define database schemas and change them over time. Because an application's database carries its state, all clients and servers are severely impacted if it stops functioning properly. Therefore, over the years many techniques and tools have been developed to deal with this process, which is called migrating the database.

In the last few years we have seen a lot of progress in the field of tools for provisioning infrastructure. From early projects such as Chef and Puppet, to more recent work such as Terraform, a great deal of thought and effort has been invested across the industry in building tools that simplify and standardize the process. The common thread between all of these projects is that, instead of manually installing and configuring software and services, they are based on machine-readable definition files, a concept also known as infrastructure-as-code (IaC).

Enter: Atlas

Atlas is at the core of Ariga's platform. In this post, I would like to share with you the work we've done so far to provide a solid foundation for managing databases in a way that's akin to infrastructure-as-code practices.

  • The Atlas DDL (Data-definition Language): we have created the Atlas DDL, a new configuration language designed to capture an organization's data topology - including relational database schemas. This language is currently described in an HCL syntax (similar to Terraform), but will support more syntaxes such as JSON and TypeScript in the future. The Atlas DDL currently supports defining schemas for SQL databases such as MySQL, Postgres, SQLite and MariaDB, but in the future, we plan to add support for other types of databases. For example:
table "users" {
schema = "default"
column "id" {
type = "int"
}
column "name" {
type = "string"
}
column "manager_id" {
type = "int"
}
primary_key {
columns = [
table.users.column.id
]
}
index "idx_name" {
columns = [
table.users.column.name
]
unique = true
}
foreign_key "manager_fk" {
columns = [table.users.column.manager_id]
ref_columns = [table.users.column.id]
on_delete = "CASCADE"
on_update = "NO ACTION"
}
}
  • The Atlas CLI: On top of the building blocks provided by the DDL, we started building our CLI tool to support the two most basic functions:

    • "Schema Inspect" - Create a schema specification file from a database.
    • "Schema Apply" - Migrate a database to a new desired state.

Many infrastructure-as-code projects have taken the declarative approach, in which the developer articulates the desired state of the system and the tool is responsible for figuring out a plan to get there. As we discussed above, changing database schemas safely is a delicate practice, so we had to build the Atlas CLI to be smart enough to understand the nuances of schema changes for each type of database.

Atlas in action

Let's see how the Atlas CLI works with real databases by starting a MySQL container:

docker run --name atlas-db -p 3306:3306 -e MYSQL_ROOT_PASSWORD=pass -e MYSQL_DATABASE=example mysql:8.0.27

Connect to our database using a native client to validate:

docker exec -it atlas-db mysql --password='pass' example
mysql> show tables;
Empty set (0.00 sec)

mysql>

Let's see how Atlas inspects it:

atlas schema inspect -u "mysql://root:pass@localhost:3306/example" > atlas.hcl

As expected, an empty schema:

# cat atlas.hcl
schema "example" {
}

Let's update our schema to:

# cat atlas.hcl
table "users" {
schema = "example"
column "id" {
null = false
type = "int"
}
column "name" {
null = false
type = "string"
size = 255
}
column "manager_id" {
null = false
type = "int"
}
primary_key {
columns = [table.users.column.id, ]
}
foreign_key "manager_fk" {
columns = [table.users.column.manager_id, ]
ref_columns = [table.users.column.id, ]
on_update = "NO ACTION"
on_delete = "CASCADE"
}
index "idx_name" {
unique = true
columns = [table.users.column.name, ]
}
index "manager_fk" {
unique = false
columns = [table.users.column.manager_id, ]
}
}
schema "example" {
}

And apply our changes!

atlas schema apply -u "mysql://root:pass@localhost:3306/example" -f atlas.hcl

-- Planned Changes:
-- Add Table : users
CREATE TABLE `example`.`users` (`id` int NOT NULL, `name` varchar(255) NOT NULL, `manager_id` int NOT NULL, PRIMARY KEY (`id`), UNIQUE INDEX `idx_name` (`name`), CONSTRAINT `manager_fk` FOREIGN KEY (`manager_id`) REFERENCES `example`.`users` (`id`) ON UPDATE NO ACTION ON DELETE CASCADE) ;
Use the arrow keys to navigate: ↓ ↑ → ←
? Are you sure?:
▸ Apply
Abort

Of course we are sure!

Using the MySQL CLI to examine our database:

mysql> describe users;
+------------+--------------+------+-----+---------+-------+
| Field      | Type         | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+-------+
| id         | int          | NO   | PRI | NULL    |       |
| name       | varchar(255) | NO   | UNI | NULL    |       |
| manager_id | int          | NO   | MUL | NULL    |       |
+------------+--------------+------+-----+---------+-------+
3 rows in set (0.00 sec)

mysql>

Let's make sure that it has the FK:

mysql> show create table users;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| users | CREATE TABLE `users` (
`id` int NOT NULL,
`name` varchar(255) NOT NULL,
`manager_id` int NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `idx_name` (`name`),
KEY `manager_fk` (`manager_id`),
CONSTRAINT `manager_fk` FOREIGN KEY (`manager_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>

Now let's see that Atlas inspects this correctly:

atlas schema inspect -u "mysql://root:pass@localhost:3306/example" > atlas.hcl
# cat atlas.hcl
table "users" {
schema = "example"
column "id" {
null = false
type = "int"
}
column "name" {
null = false
type = "string"
size = 255
}
column "manager_id" {
null = false
type = "int"
}
primary_key {
columns = [table.users.column.id, ]
}
foreign_key "manager_fk" {
columns = [table.users.column.manager_id, ]
ref_columns = [table.users.column.id, ]
on_update = "NO ACTION"
on_delete = "CASCADE"
}
index "idx_name" {
unique = true
columns = [table.users.column.name, ]
}
index "manager_fk" {
unique = false
columns = [table.users.column.manager_id, ]
}
}
schema "example" {
}

Let's see what happens when we try to reapply the same change:

atlas schema apply -u "mysql://root:pass@localhost:3306/example" -f atlas.hcl
Schema is synced, no changes to be made

In this example we have shown how we can inspect a MySQL database schema and apply a change.

What's Next?

The Atlas DDL opens up a world of tools and services, and with the help of our community, we are planning to push the development ecosystem forward. A list of tools that are on our roadmap includes:

  • Integrations with Terraform, GitHub Actions and Kubernetes.
  • Extended migration logic, such as renaming columns, adding or dropping nullability and altering enums.
  • Toolsets for examining the migration history and reproducing it.

We hope that you find Atlas CLI as exciting as we do, and we invite you to contribute your ideas and code.