
5 posts tagged with "terraform"


11 min read
Rotem Tamir

The Evolution of Database Schema Management

In today's world, where software is developed and deployed at a rapid pace, selecting the right tools for your team is crucial. The right tools can help you move faster, be more productive, and reduce the risk of errors and bugs. Most backend applications are backed by a database, and maintaining the database schema is a critical part of the development process.

Our industry has seen many trends in the way database schemas are managed over the years. In the early days, DBAs would manually run SQL DDL commands directly against the database through a terminal. Then came GUI-based, point-and-click database management tools that included schema editing capabilities.

In the early 2000s, we saw the rise of "database migration tools" that allowed developers to write versioned database upgrade scripts (usually in SQL), called "migrations", that would be executed by the tool itself. This trend intensified with the rise of DevOps and CI/CD, as migration tools became the de facto standard for automating database schema management.

The Rise of Database Schema-as-Code

In recent years, we have witnessed a seismic shift in the way cloud infrastructure is managed. The rise of tools such as Terraform and Kubernetes has led to the concept of "Infrastructure-as-Code". As the complexity of infrastructure grew, it became clear that the only way to manage it effectively was to use a declarative approach and treat it "as code".

Infrastructure-as-code, in this context, means that the desired state of the infrastructure is defined in a formal language ("as code") and stored in a version control system (such as Git). The infrastructure is then managed by a tool that compares the desired state to the actual state and makes the necessary changes to bring the actual state to the desired state.

This approach has many advantages over the traditional imperative approach of manually running commands against the infrastructure. It allows for better collaboration between team members, it is more robust and less susceptible to errors, it is self-documenting, and it provides better visibility, governance, and control over the state of the infrastructure.

Following this trend, projects such as Atlas and Skeema have started to apply the same principles to database schema management. Instead of an imperative approach, where developers do the planning and tools are only used to execute the plan and keep track of what has been done, they bring a simpler approach to the table. This approach, which is gaining the name "Database Schema-as-Code", applies declarative principles to database schema management: developers provide the desired state of the database and the tool automatically plans the required migrations.

A practical example

Let's use Atlas to show a quick example of how database schema-as-code tools work.

Atlas allows users to represent their database schema in plain SQL or in HCL. The schema is saved in a schema file, which holds the desired state of our database.

We will start with a simple schema that represents a users table, in which each user has an ID and a name:

schema.hcl
table "users" {
schema = schema.example
column "id" {
null = false
type = int
}
column "name" {
null = true
type = varchar(100)
}
primary_key {
columns = [column.id]
}
}

Now, let's apply this schema to an empty MySQL database:

atlas schema apply \
  -u "mysql://root:pass@localhost:3306/example" \
  --to file://schema.hcl \
  --dev-url "docker://mysql/8/example"

Atlas will compare the desired state from our schema file to the database at the URL provided, and plan the migration:

-- Planned Changes:
-- Create "users" table
CREATE TABLE `users` (
`id` int NOT NULL,
`name` varchar(100) NULL,
PRIMARY KEY (`id`)
) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
? Are you sure?:
  ▸ Apply
    Lint and edit
    Abort

After applying this to our database, we will run the command once more and see that Atlas recognizes there is no drift between the two states:

Schema is synced, no changes to be made

Let's make a simple change to our schema by adding an email column:

schema.hcl
table "users" {
schema = schema.example
column "id" {
null = false
type = int
}
column "name" {
null = true
type = varchar(100)
}
column "email" {
null = true
type = varchar(255)
}
primary_key {
columns = [column.id]
}
}

Now let's run the schema apply command again, and view the migration plan that Atlas will create for us:

-- Planned Changes:
-- Modify "users" table
ALTER TABLE `users` ADD COLUMN `email` varchar(255) NULL;
? Are you sure?:
  ▸ Apply
    Lint and edit
    Abort

In this example, we got a glimpse of how a tool like Atlas manages a database schema using the declarative approach.

Do you need a Database Schema-as-Code tool?

If you are using an older, more traditional database migration tool, or not using one at all, you might be wondering whether you should consider switching to a Database Schema-as-Code tool. Here are five signs that you might need to consider it:

1. Your team is making changes to the database schema directly in production

The worst kind of tool for managing database schemas is no tool at all. Running migrations manually against production databases should be avoided at all costs. First, humans are notoriously prone to errors, and the risk of making a mistake that will cause downtime or data loss is very high. Second, any changes made directly in production are not tracked anywhere, so it is very hard to know what has been done and by whom. Third, it means that your deployments cannot be fully automated and you are missing out on the benefits of CI/CD.

Using a Database Schema-as-Code tool as the exclusive way of making changes to the database schema is a great way to avoid these problems. It allows you to automate your deployments, track all changes to the database schema in source control, and collaborate more effectively with your team.

2. You don't have a single source of truth for your database schema

Having a single source of truth for how your application is deployed is a key principle of DevOps. It allows you to have a clear picture of the state of your application at any given time, and it allows you to automate your deployments and avoid human errors.

By requiring all migrations to be checked in to source control, traditional migration tools can go a long way to ensure that the database schema is in a consistent state, but they suffer from two issues in this regard:

  1. They only keep track of the changes to the database ("migrations"). To figure out the actual schema of the database at any given version, you need to apply all the migrations up to that version to an empty database and inspect the result with an additional tool. Database Schema-as-Code tools, on the other hand, naturally document the schema of the database at any given version.

  2. They have no way to ensure that the actual database's schema is consistent with the migrations. If someone makes a change directly in the database, there is no way to know about it. When you apply your schema to a target database using a Database Schema-as-Code tool, it will compare the desired state to the actual state and make the necessary changes to bring the actual state to the desired state.

    This ensures that over time your actual databases will not drift from the desired state.
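For example, drift can be detected without applying anything by running the apply command in dry-run mode. A sketch, reusing the earlier example (the URLs are illustrative; the --dry-run flag prints the planned SQL without executing it):

atlas schema apply \
  -u "mysql://root:pass@localhost:3306/example" \
  --to file://schema.hcl \
  --dev-url "docker://mysql/8/example" \
  --dry-run

If the database matches the schema file, the command reports that the schema is synced; any planned statements it prints indicate drift.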

3. You don't have any database experts on your team

In the past, managing database schemas was the responsibility of DBAs. They were the ones responsible for making sure that the database schema was consistent and that all changes were made safely.

In recent years, as cloud-native, microservices-based architectures have become more popular, the number of databases that each team manages has grown significantly. This has led to a situation where many teams don't have a DBA, and the responsibility for managing the database schema falls on the shoulders of the developers.

If you maintain a small schema with a few tables, you might not feel the need for a tool to help you manage it. But as schemas grow in size and complexity, to contain thousands of database objects (tables, views, functions, stored procedures, etc.), it becomes very hard to manage them manually. A Database Schema-as-Code tool can help you manage your schema more effectively, even if you don't have any DBAs on your team.

4. You manually verify the safety of your database migrations

If your application is small and the stakes for making a mistake are low, you might be able to get away with messing up a schema change now and then. But if your workload is mission-critical, if application downtime or data loss can have a significant impact on your business, you need to make sure that your migrations are safe.

Many teams have been through the unpleasant experience of a failed migration that caused a significant outage. This is usually followed by a change in review processes where migrations are reviewed by multiple team members, and a manual verification process is added to make sure that the migration is safe to run.

This process is time-consuming and error-prone. Even worse, it turns team members with more database experience into bottlenecks, slowing down the development process and team velocity.

Database Schema-as-Code tools can help you automate this process and make it more robust. Atlas, for example, comes with built-in support for Migration Linting, Diff Policies, and other safety mechanisms that help you catch errors and issues before they lead to a full-blown production outage.
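For instance, here is a sketch of what running Atlas's migration linting might look like locally, assuming a directory named migrations and an ephemeral Docker-based dev database (both names are illustrative):

# Lint the most recent migration file against a clean dev database.
atlas migrate lint \
  --dir "file://migrations" \
  --dev-url "docker://mysql/8" \
  --latest 1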

5. Your app and infra delivery are way better than your database deployments

When interviewing platform engineers about how their deployment infrastructure works, we keep hearing the same story: "Our deployments are 100% pure Infrastructure-as-Code GitOps goodness, we can do them 50 times a day!" And when we ask about databases… "Oh yeah, there’s that part... It’s not as great."

A lot has changed in the way applications are deployed since the early days of database migrations. The versioned migrations paradigm was a great improvement over the previous manual approach, but it is still very much an imperative approach. Modern application delivery principles such as GitOps and Infrastructure-as-Code work in a declarative way, and so they need a database schema management tool that works in a similar way.

If your application and infra delivery feel like a well-oiled machine, but your database deployments are still semi-manual, or feel clunky and error-prone, you should consider switching to a Database Schema-as-Code tool that lets you manage your database schema via a Kubernetes Operator or a Terraform Provider.

Enter: Atlas

Atlas is a Database Schema-as-Code tool that allows you to manage your database schema in a declarative way. With Atlas, developers provide the desired state of the database schema and Atlas automatically plans the required migrations. To get started with Atlas, you can follow our quickstart guide.

Wrapping up

In this article, we have discussed the evolution of database schema management tools and the rise of Database Schema-as-Code tools. We have also discussed five signs that you might need to consider switching to a Database Schema-as-Code tool. I hope you found this article useful.

As always, we would love to hear your feedback and suggestions on the Atlas Discord server.

11 min read
Rotem Tamir

Introduction

When we started building Atlas a couple of years ago, we noticed that there was a substantial gap between what was then considered state-of-the-art in managing database schemas and the strides Infrastructure-as-Code (IaC) had recently made in managing cloud infrastructure.

In this post, we review that gap and show how Atlas – along with its Terraform provider – can bridge the two domains.

As an aside, I usually try to keep blog posts practical and to the point, but occasionally think it’s worth it to zoom out and explain the grander ideas behind what we do.

If you’re looking for a quick and practical explanation of working with Atlas and Terraform, I recommend this YouTube video.

Why Infrastructure-as-Code

Infrastructure as Code (IaC) refers to the practice of managing and provisioning infrastructure through machine-readable configuration files, instead of utilizing traditional interactive configuration tools. This approach makes for automated, consistent, and repeatable deployment of environments that are faster and less error-prone than previous, more manual approaches.

Terraform, a popular open-source tool created by HashiCorp, is the most prominent implementation of the IaC concept. With Terraform, organizations can describe the desired state of their infrastructure in a simple configuration language (HCL) and let Terraform plan and apply these changes in an automated way.
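For example, a single piece of cloud infrastructure might be declared with a minimal, illustrative snippet like the following (the resource and bucket name are hypothetical):

# Desired state: one S3 bucket. Terraform plans whatever changes
# are needed to make the real bucket match this block.
resource "aws_s3_bucket" "assets" {
  bucket = "my-team-assets-bucket"
}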

Terraform (and IaC in general) has taken the software engineering world by storm in recent years. As someone who had the dubious pleasure of managing complex cloud infrastructure manually, using what is today jokingly called "ClickOps", I can mention a few properties of IaC that I believe contributed to this success:

  • Declarative – Terraform is built on a declarative workflow, which means that users only define the final (desired) state of their system. Terraform is responsible for inspecting the target environment, calculating the difference between the current and desired states, and building a plan for reconciling between those two states.

    Cloud infrastructures are becoming increasingly complex, comprising thousands of different, interconnected components. Declarative workflows greatly reduce the mental overhead of planning changes to such environments.

  • Automated – Many engineers can attest that manually provisioning a new environment used to take days, even weeks! Once Terraform generates a plan for changing environments, the process runs automatically and finishes in a matter of minutes.

  • Holistic – With Terraform, it is possible to capture all of the resources and configurations required to provision an application as one interconnected and formally defined dependency graph. Deployments become truly reproducible and automated, with no dangling or manually provisioned dependencies.

  • Self-healing – Finally, these three properties converge to support a self-healing tool that can detect and fix drift on its own. Whenever drift occurs, it is only a matter of re-running Terraform to shift from the current state back to the desired one.

Comparing IaC with Schema Management Tools

Next, let’s discuss the current state of database schema management tools (often called schema migration tools) by contrasting them with the properties of IaC.

  • Imperative – If Terraform embodies the declarative approach, then schema management tools often exemplify the opposite, imperative (or revision-based) approach. In this case, we don’t provide the tools with the what (the desired state of the database), but the how (what SQL commands need to run to migrate the database from the previous version to the next).

  • Semi-automated – Migration tools were revolutionary when they came out a decade ago. One idea stood as one of the harbingers of the GitOps philosophy: that database changes should not be applied manually but first checked into source control and then applied automatically by a tool.

    Today’s migration tools automate two aspects of schema management: 1) execution and 2) tracking which migrations were already executed on a target database.

    Compared to modern IaC tools, however, they are fairly manual. In other words, they leave the responsibility of planning and verifying the safety of changes to the user.

  • Fragmented – As we described above, one of the most pleasant aspects of adopting the IaC mindset is having a unified, holistic description of your infrastructure, to the point where you can entirely provision it from a single terraform apply command.

    For database schema management, common practices are anything but holistic. In some cases, provisioning the schema might happen 1) when application servers boot, before the application starts, or 2) in an init container on Kubernetes.

    In fact, some places (yes, even established companies) still have developers manually connect (with root credentials) to the production database to execute schema changes!

  • A pain to fix – When a migration deployment fails, many schema management tools will actually get in your way. Instead of worrying about fixing the issue at hand, you now need to worry about both your database and the way your migration tool sees it (which have now diverged).

Bridging the Gap

After describing the gap between IaC and database schema management in more detail, let’s delve into what it would take to bridge it. Our goal is to have schema management become an integral part of your day-to-day IaC pipeline so that you can enjoy all the positive properties we described above.

To integrate schema change management and IaC, we would need to solve two things:

  1. A diffing engine capable of supporting declarative migration workflows. Such an engine should be capable of:
    • Loading the desired schema of the database in some form
    • Inspecting the current schema of the database
    • Calculating a safe migration plan automatically
  2. A Terraform Provider that wraps the engine as a Terraform resource, which can then seamlessly integrate into your overall application infrastructure configuration.

How Atlas drives Declarative Migrations

Atlas is a language-agnostic tool for managing and migrating database schemas using modern DevOps principles. It is different from Terraform in many ways, but similar enough to have received the informal nickname "Terraform for Databases".

At its core lie three capabilities that make it ideal to apply a declarative workflow to schema management:

  1. Schema loaders
  2. Schema inspection
  3. Diffing and planning

Let’s discuss each of these capabilities in more detail.

Schema loaders

Every declarative workflow begins with the desired state - what we want the system to look like. Using a mechanism called "schema loaders", Atlas users can provide the desired schema in many ways. For example:

Plain SQL

Atlas users can describe the desired schema of the database using plain SQL DDL statements such as:

CREATE TABLE users (
  Id int primary key,
  Name varchar(255)
)

Atlas HCL

Alternatively, users can use Atlas HCL, a configuration language that shares Terraform’s configuration language foundations:

table "users" {
schema = schema.public
column "id" {
type = int
}
column "name" {
type = varchar(255)
}
column "manager_id" {
type = int
}
primary_key {
columns = [
column.id
]
}
}

A live database

In addition, users can provide Atlas with a connection to an existing database, which Atlas can inspect and use as the desired state of the database.

External Schemas (ORM)

Finally, Atlas has an easily extensible design which makes writing plugins to load schemas from external sources a breeze. For example, Atlas can read the desired schema of the database directly from your ORM, using a simple integration.
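For illustration, hooking an ORM into Atlas might look roughly like this atlas.hcl snippet, based on Atlas's GORM integration (the paths and dialect here are hypothetical):

data "external_schema" "gorm" {
  # Run the GORM provider to print the schema derived from the Go models.
  program = ["go", "run", "-mod=mod", "ariga.io/atlas-provider-gorm", "load", "--path", "./models", "--dialect", "mysql"]
}

env "gorm" {
  src = data.external_schema.gorm.url
  dev = "docker://mysql/8/dev"
}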

Schema inspection

Once Atlas understands the desired state of the database, it needs to inspect the existing database to understand its current schema. This is done by connecting to the target database and querying the database’s information schema to construct a schema graph (an in-memory representation of all the components in the database and their connections).
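The same inspection capability is exposed directly in the Atlas CLI. For example, a command along these lines (the connection string is illustrative) prints the schema of a running database in Atlas HCL:

atlas schema inspect -u "mysql://root:pass@localhost:3306/example"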

Diffing and planning

The next phase involves calculating the difference ("diffing") between the desired and current states and calculating an execution plan to reconcile this difference. Because resources are often interconnected, Atlas must create a sensible order of execution using algorithms such as Topological Sort to ensure, for example, that dependencies on a resource are removed before it is dropped.

In addition, each database engine has its own peculiarities and limitations to take into account when creating an execution plan. For example, adding a default value to a column in a SQLite database must be performed with a multi-step plan that looks similar to this:

-- Planned Changes:
-- Create "new_users" table
CREATE TABLE `new_users` (`id` int NOT NULL, `greeting` text NOT NULL DEFAULT 'shalom')
-- Copy rows from old table "users" to new temporary table "new_users"
INSERT INTO `new_users` (`id`, `greeting`) SELECT `id`, IFNULL(`greeting`, 'shalom') AS `greeting` FROM `users`
-- Drop "users" table after copying rows
DROP TABLE `users`
-- Rename temporary table "new_users" to "users"
ALTER TABLE `new_users` RENAME TO `users`
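Plans like this can also be computed without applying anything, using Atlas's diff command. A sketch, assuming a target database and a local schema.hcl file (the URLs are illustrative):

atlas schema diff \
  --from "mysql://root:pass@localhost:3306/example" \
  --to "file://schema.hcl" \
  --dev-url "docker://mysql/8/example"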

Atlas in action

What does this workflow look like in practice? Following Atlas's "Getting Started" guide, suppose we make a change to our desired schema that adds a new table named blog_posts (this change may be described in a plain SQL file, an HCL file, or even in your ORM's data model).

To apply the desired schema on a target database you would use the schema apply command:

atlas schema apply \
  -u "mysql://root:pass@localhost:3306/example" \
  --to file://schema.sql \
  --dev-url "docker://mysql/8/example"

Atlas will then generate a plan:

-- Planned Changes:
-- Create "blog_posts" table
CREATE TABLE `example`.`blog_posts` (`id` int NOT NULL, `title` varchar(100) NULL, `body` text NULL, `author_id` int NULL, PRIMARY KEY (`id`), INDEX `author_id` (`author_id`), CONSTRAINT `author_fk` FOREIGN KEY (`author_id`) REFERENCES `example`.`users` (`id`))
Use the arrow keys to navigate: ↓ ↑ → ←
? Are you sure?:
  ▸ Apply
    Abort

Observing this example, you may begin to understand how Atlas earned its nickname the "Terraform for Databases."

Integrating with Terraform

The second piece of bridging the gap is to create a Terraform Provider that wraps Atlas and allows users to define resources that represent the schema definition as part of their overall infrastructure.

Ariga (the company behind Atlas) is an official HashiCorp Tech Partner and publishes the Atlas Terraform Provider, which was created precisely to solve this problem.

Using the Atlas Terraform Provider, users can finally provision their database instance and its schema in one holistic definition. For example, suppose we provision a MySQL database using AWS RDS:

// Our RDS-based MySQL 8 instance.
resource "aws_db_instance" "atlas-demo" {
  identifier     = "atlas-demo"
  instance_class = "db.t3.micro"
  engine         = "mysql"
  engine_version = "8.0.28"
  // Some fields skipped for brevity
}

Next, we load the desired schema from an HCL file, using the Atlas Provider:

data "atlas_schema" "app" {
src = "file://${path.module}/schema.hcl"
}

Finally, we use the atlas_schema resource to apply our schema to the database:

// Apply the normalized schema to the RDS-managed database.
resource "atlas_schema" "hello" {
  hcl = data.atlas_schema.app.hcl
  url = "mysql://${aws_db_instance.atlas-demo.username}:${urlencode(random_password.password.result)}@${aws_db_instance.atlas-demo.endpoint}/"
}

You can find a full example here.

When we run terraform apply, this is what will happen:

  • Terraform will provision the RDS database using the AWS Provider.
  • Terraform will use Atlas to inspect the existing schema of the database and load the desired state from a local HCL file.
  • Atlas will calculate a SQL plan for Terraform that reconciles the two.

And this is how it may look in the Terraform plan:

Terraform will perform the following actions:

  # atlas_schema.hello will be created
  + resource "atlas_schema" "hello" {
      + hcl = <<-EOT
            table "posts" {
              schema = schema.app
              column "id" {
                null = false
                type = int
              }
              column "user_id" {
                null = false
                type = int
              }
              column "title" {
                null = false
                type = varchar(255)
              }
              column "body" {
                null = false
                type = text
              }
              primary_key {
                columns = [column.id]
              }
              foreign_key "posts_ibfk_1" {
                columns     = [column.user_id]
                ref_columns = [table.users.column.id]
                on_update   = NO_ACTION
                on_delete   = CASCADE
              }
              index "user_id" {
                columns = [column.user_id]
              }
            }
            table "users" {
              schema = schema.app
              column "id" {
                null = false
                type = int
              }
              column "user_name" {
                null = false
                type = varchar(255)
              }
              column "email" {
                null = false
                type = varchar(255)
              }
              primary_key {
                columns = [column.id]
              }
            }
            schema "app" {
              charset = "utf8mb4"
              collate = "utf8mb4_0900_ai_ci"
            }
        EOT
      + id  = (known after apply)
      + url = (sensitive value)
    }

  # aws_db_instance.atlas-demo will be created
  + resource "aws_db_instance" "atlas-demo" {
      // .. redacted for brevity
    }

And that's how you bridge the gap between IaC and schema management!

Conclusion

In this blog post, we reviewed some exceptional properties of Infrastructure-as-Code tools, such as Terraform, that have led to their widespread adoption and success in the industry. We then reviewed the current state of a similar problem, database schema management, in contrast to these properties. Finally, we showcased Atlas’s ability to adapt some IaC principles into the domain of schema management and how we can unify the two domains using the Atlas Terraform Provider.

How can we make Atlas better?

We would love to hear from you on our Discord server ❤️.

9 min read
Rotem Tamir
TL;DR

You can now use plain SQL to define the desired state of your database schema.

See an example

Earlier today, we released v0.5.0 of the Atlas Terraform Provider. This release includes two changes that, in my opinion, are a great improvement to the experience of working with the Atlas Provider.

In this post, I will discuss these two changes and how they can help you to manage your database schemas with Terraform:

  • Support for the docker:// driver for dev-databases.
  • Support for defining the desired state of your database schema in plain SQL (and any other schema loader supported by Atlas).

Improving the dev-database experience with the docker:// driver

Atlas relies on a connection to an empty database, called the "dev-database", which it uses for various calculations and operations, such as validating the correctness of user-provided code and normalizing user input to the way the database actually sees it.

In previous versions of the Atlas Provider, the dev-database needed to be provided by the user. This was a bit cumbersome, as the user needed to spin up a database (usually by running a local Docker container), and then provide the connection string to it in the dev_url field of the atlas_schema resource.

To improve this experience, we added support for the docker:// driver, which lets the user provide only the database engine and version; Atlas will spin up an ephemeral container with the correct engine and version. In addition, starting with this version, users may define the dev_url on the provider scope. For example:

provider "atlas" {
dev_url = "docker://mysql/8/myapp"
}

Defining the desired state of the database schema in plain SQL

In earlier versions of the Atlas Provider, the atlas_schema resource required the user to provide an Atlas HCL file which describes the desired state of the database schema. Many users found this syntax, which resembles Terraform's own, to be clean and concise. However, others disliked it and asked for a way to define the desired state in plain SQL.

To support this use case, and many others, we have added support for "schema loaders" - components that can be used to load the desired schema from many kinds of sources (such as plain SQL, an existing database, or the data model of an ORM). To use this capability, users may use the atlas_schema data source, whose src field accepts a URL that points to the desired schema. The scheme of this URL determines which schema loader will be used, for instance:

  • file://schema.sql - loads the schema from a local SQL file.
  • mysql://root:pass@localhost:3306/myapp - loads the schema from an existing MySQL database.
  • ent://service/ent/schema - loads the schema from the schema of an Ent project.
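For example, loading the desired state directly from an existing database might look like this illustrative snippet (the credentials are hypothetical):

data "atlas_schema" "from_db" {
  # The mysql:// scheme selects the live-database schema loader.
  src = "mysql://root:pass@localhost:3306/myapp"
}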

Managing database schemas in plain SQL using Terraform

info

You can find the final code for this example here.

In the following example, we will show how you can use Terraform and the Atlas provider to manage a MySQL database schema in plain SQL.

Let's start by creating a Terraform file named main.tf that installs the Atlas Terraform provider:

terraform {
  required_providers {
    atlas = {
      source  = "ariga/atlas"
      version = "0.5.0"
    }
  }
}

In addition to installing the Atlas provider, we will also spin up a local MySQL database using Docker, which will represent the target database that we will manage with Terraform. In a real-world scenario, you would probably use a managed database service such as AWS RDS or Google Cloud SQL, but for brevity, a local database will suffice. Run:

docker run -d --name mysql -p 3306:3306 -e MYSQL_ROOT_PASSWORD=pass -e MYSQL_DATABASE=myapp mysql:8

Now that we have a database to manage, we can define the desired state of the database schema. Add a file named "schema.sql" with the following content:

create table users (
  id int not null auto_increment primary key,
  name varchar(255) not null
);

Next, we will define an atlas_schema data source that will load the schema from the schema.sql file:

data "atlas_schema" "sql" {
src = "file://${path.module}/schema.sql"
}

Finally, we will define an atlas_schema resource that will manage the schema in the target database. In addition, we will configure the Atlas provider to use the docker:// driver to spin up a temporary database container:

provider "atlas" {
dev_url = "docker://mysql/8/myapp"
}

resource "atlas_schema" "mysql" {
url = "mysql://root:pass@localhost:3306/myapp"
hcl = data.atlas_schema.sql.hcl
}

Now that we have defined our Terraform configuration, we can run terraform init to install the required providers:

terraform init

This should output something like:

Initializing provider plugins...
- Finding ariga/atlas versions matching "0.5.0"...
- Installing ariga/atlas v0.5.0...
- Installed ariga/atlas v0.5.0 (signed by a HashiCorp partner, key ID 45441FCEAAC3770C)

# ...

Terraform has been successfully initialized!

Finally, we can run terraform apply to create the database schema:

terraform apply

Terraform will print the following plan:

data.atlas_schema.sql: Reading...
data.atlas_schema.sql: Read complete after 4s [id=qnUvTyupgQzivof5LYWDOQ]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # atlas_schema.myapp will be created
  + resource "atlas_schema" "myapp" {
      + hcl = <<-EOT
            table "hello" {
              schema = schema.myapp
              column "world" {
                null = true
                type = text
              }
              column "thoughts" {
                null = true
                type = varchar(100)
              }
            }
            schema "myapp" {
              charset = "utf8mb4"
              collate = "utf8mb4_0900_ai_ci"
            }
        EOT
      + id  = (known after apply)
      + url = (sensitive value)
    }

Plan: 1 to add, 0 to change, 0 to destroy.

│ Warning: Atlas Plan

│ with atlas_schema.myapp,
│ on main.tf line 18, in resource "atlas_schema" "myapp":
│ 18: resource "atlas_schema" "myapp" {

│ The following SQL statements will be executed:


│ CREATE TABLE `myapp`.`hello` (`world` text NULL, `thoughts` varchar(100) NULL) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci



Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.

Enter a value:

Notice that the plan shows the SQL statements that will be executed to create the database schema, as well as our loaded schema in its HCL representation - the latter was produced by the schema loader from the schema.sql file.

If you are happy with the plan, type yes and press enter to apply the changes:

Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.

Enter a value: yes

atlas_schema.myapp: Modifying... [id=mysql:///myapp]
atlas_schema.myapp: Modifications complete after 4s [id=mysql:///myapp]

Apply complete! Resources: 0 added, 1 changed, 0 destroyed.

Hooray! We have successfully created a database schema using Terraform and the Atlas provider.

Terraform's true power comes from its declarative nature - we feed it a desired state and it will make sure that the actual state matches the desired state. Atlas is a perfect match for this paradigm. Let's see what happens if we change the schema in the schema.sql file and run terraform apply again:

Update the contents of schema.sql to the following:

create table `groups` (
  id int not null auto_increment primary key,
  name varchar(255) not null
);

create table `users` (
  id int not null auto_increment primary key,
  name varchar(255) not null,
  group_id int not null,
  foreign key (group_id) references `groups` (id)
);

Re-apply the changes:

terraform apply

Observe that our plan includes the addition of the groups table as well as the foreign key constraint on the users table:

data.atlas_schema.sql: Reading...
data.atlas_schema.sql: Read complete after 4s [id=Qhci62i6CFYRQ2CmUOjMeA]
atlas_schema.myapp: Refreshing state... [id=mysql:///myapp]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # atlas_schema.myapp will be updated in-place
  ~ resource "atlas_schema" "myapp" {
      ~ hcl = <<-EOT
          + table "groups" {
          +   schema = schema.myapp
          +   column "id" {
          +     null           = false
          +     type           = int
          +     auto_increment = true
          +   }
          +   column "name" {
          +     null = false
          +     type = varchar(255)
          +   }
          +   primary_key {
          +     columns = [column.id]
          +   }
          + }
            table "users" {
              schema = schema.myapp
              column "id" {
                null           = false
                type           = int
                auto_increment = true
              }
              column "name" {
                null = false
                type = varchar(255)
              }
          +   column "group_id" {
          +     null = false
          +     type = int
          +   }
              primary_key {
                columns = [column.id]
              }
          +   foreign_key "users_ibfk_1" {
          +     columns     = [column.group_id]
          +     ref_columns = [table.groups.column.id]
          +     on_update   = NO_ACTION
          +     on_delete   = NO_ACTION
          +   }
          +   index "group_id" {
          +     columns = [column.group_id]
          +   }
            }
            schema "myapp" {
              charset = "utf8mb4"
              collate = "utf8mb4_0900_ai_ci"
            }
        EOT
        id = "mysql:///myapp"
        # (1 unchanged attribute hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

│ Warning: Atlas Plan

│ with atlas_schema.myapp,
│ on main.tf line 18, in resource "atlas_schema" "myapp":
│ 18: resource "atlas_schema" "myapp" {

│ The following SQL statements will be executed:


│ CREATE TABLE `myapp`.`groups` (`id` int NOT NULL AUTO_INCREMENT, `name` varchar(255) NOT NULL, PRIMARY KEY (`id`)) CHARSET
│ utf8mb4 COLLATE utf8mb4_0900_ai_ci
│ ALTER TABLE `myapp`.`users` ADD COLUMN `group_id` int NOT NULL, ADD INDEX `group_id` (`group_id`), ADD CONSTRAINT
│ `users_ibfk_1` FOREIGN KEY (`group_id`) REFERENCES `myapp`.`groups` (`id`) ON UPDATE NO ACTION ON DELETE NO ACTION



Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.

Enter a value:

After typing yes and pressing enter, Terraform will apply the changes, bringing the actual state of the database schema in line with the desired state:

atlas_schema.myapp: Modifying... [id=mysql:///myapp]
atlas_schema.myapp: Modifications complete after 4s [id=mysql:///myapp]

Apply complete! Resources: 0 added, 1 changed, 0 destroyed.

Conclusion

In this tutorial, we have seen how to use Terraform to manage the schema of a MySQL database using the Atlas provider with plain SQL. Using this workflow, teams can bridge the gap between their database schema management flows and their Terraform workflows, allowing for simpler and more reliable software delivery.

How can we make Atlas better?

We would love to hear from you on our Discord server ❤️.

5 min read
Tran Minh Giau

Introduction

Today we are very excited to announce the release of Atlas Terraform Provider v0.4.0. This release brings some exciting new features and improvements to the provider which we will describe in this post.

In addition, this release is the first to be published under our new partnership with HashiCorp as a Technology Partner. Atlas is sometimes described as a "Terraform for Databases", so we have high hopes that this partnership will open up opportunities to better integrate database schema management into IaC workflows.

What's new

When people first hear about integrating schema management into declarative workflows, many raise the concern that because making changes to the database is a high-risk operation, they would not trust a tool to do it automatically.

This is a valid concern, and this release contains three new features that we believe will help to address it:

  • SQL plan printing
  • Versioned migrations support
  • Migration safety verification

SQL Plan Printing

In previous versions of the provider, we displayed the plan as a textual diff showing which resources are added, removed or modified. With this version, the provider will also print the SQL statements that will be executed as part of the plan.

For example, suppose we have the following schema:

schema "market" {
charset = "utf8mb4"
collate = "utf8mb4_0900_ai_ci"
comment = "A schema comment"
}

table "users" {
schema = schema.market
column "id" {
type = int
}
column "name" {
type = varchar(255)
}
primary_key {
columns = [
column.id
]
}
}

And our Terraform module looks like this:

terraform {
  required_providers {
    atlas = {
      source  = "ariga/atlas"
      version = "0.4.0"
    }
  }
}

provider "atlas" {}

data "atlas_schema" "market" {
  src        = file("${path.module}/schema.hcl")
  dev_db_url = "mysql://root:pass@localhost:3307"
}

resource "atlas_schema" "market" {
  hcl        = data.atlas_schema.market.hcl
  url        = "mysql://root:pass@localhost:3306"
  dev_db_url = "mysql://root:pass@localhost:3307"
}

When we run terraform plan we will see the following output:

Plan: 1 to add, 0 to change, 0 to destroy.

│ Warning: Atlas Plan

│ with atlas_schema.market,
│ on main.tf line 17, in resource "atlas_schema" "market":
│ 17: resource "atlas_schema" "market" {

│ The following SQL statements will be executed:


│ -- add new schema named "market"
│ CREATE DATABASE `market`
│ -- create "users" table
│ CREATE TABLE `market`.`users` (`id` int NOT NULL, `name` varchar(255) NOT NULL, PRIMARY KEY (`id`)) CHARSET utf8mb4 COLLATE utf8mb4_0900_ai_ci

Versioned migrations

Atlas supports two types of workflows: declarative and versioned. With declarative workflows, the plan to migrate the database is generated automatically at runtime. Versioned migrations provide teams with a more controlled workflow where changes are planned, checked in to source control, and reviewed ahead of time. Until today, the Terraform provider only supported the declarative workflow. This release adds support for versioned migrations as well.

Suppose we have the following migration directory of two files:

20221101163823_create_users.sql
CREATE TABLE `users` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `age` bigint(20) NOT NULL,
  `name` varchar(255) COLLATE utf8mb4_bin NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `age` (`age`)
);

atlas.sum
h1:OlaV3+7xXEWc1uG/Ed2zICttHaS6ydHZmzI7Hpf2Fss=
20221101163823_create_users.sql h1:mZirkpXBoLLm+M73EbHo07muxclifb70fhWQFfqxjD4=
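The atlas.sum file is an integrity checksum over the migration directory; if the directory is ever edited by hand, the checksum can be recalculated with the Atlas CLI. A sketch, assuming the directory layout above:

atlas migrate hash --dir "file://migrations"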

We can use the Terraform Atlas provider to apply this migration directory to a database:

terraform {
  required_providers {
    atlas = {
      source  = "ariga/atlas"
      version = "0.4.0"
    }
  }
}

provider "atlas" {}

// The `atlas_migration` data source loads the current state of the given database
// with regard to the migration directory.
data "atlas_migration" "hello" {
  dir = "migrations?format=atlas"
  url = "mysql://root:pass@localhost:3306/hello"
}

// The `atlas_migration` resource applies the migration directory to the database.
resource "atlas_migration" "hello" {
  dir     = "migrations?format=atlas"
  version = data.atlas_migration.hello.latest # Use latest to run all migrations
  url     = data.atlas_migration.hello.url
  dev_url = "mysql://root:pass@localhost:3307/test"
}

Running terraform plan will show the following output:

data.atlas_migration.hello: Reading...
data.atlas_migration.hello: Read complete after 0s [id=migrations?format=atlas]

Terraform used the selected providers to generate the following execution plan.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # atlas_migration.hello will be created
  + resource "atlas_migration" "hello" {
      + dev_url = (sensitive value)
      + dir     = "migrations?format=atlas"
      + id      = (known after apply)
      + status  = (known after apply)
      + url     = (sensitive value)
      + version = "20221101163823"
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Linting

Atlas provides extensive support for linting database schemas. This release adds support for linting as part of the Terraform plan: you can now run terraform plan and see whether your schema changes raise any linting errors before you apply them. This is especially useful when you are using the versioned migrations workflow.

Suppose we add the following migration:

20221101165036_change_unique.sql
ALTER TABLE users
  DROP KEY age,
  ADD CONSTRAINT NAME UNIQUE (`name`);

If we run terraform plan with the above migration, Terraform prints the following warning:


│ Warning: data dependent changes detected

│ with atlas_migration.hello,
│ on main.tf line 20, in resource "atlas_migration" "hello":
│ 20: resource "atlas_migration" "hello" {

│ File: 20221101165036_change_unique.sql

│ - MF101: Adding a unique index "NAME" on table "users" might fail in case column
│ "name" contains duplicate entries

Atlas detected that the migration may fail if the column name contains duplicate entries! This is a very useful warning that can help you avoid unexpected deployment failures. Atlas supports many more safety checks, which you can read about here.

Wrapping up

In this blog post, we discussed three new features added to the Terraform Atlas provider that are designed to make managing your database schemas with Terraform safer and more predictable. We hope you will enjoy this release!

Have questions? Feedback? Feel free to reach out on our Discord server.

5 min read
Amit Shani

Today we are glad to announce the release of the official Atlas Terraform Provider.

What is Terraform

Terraform is a popular open-source tool created by HashiCorp, used to greatly simplify the task of provisioning and managing resources in the cloud. With Terraform, organizations can describe the desired state of their infrastructure in a simple configuration language and let Terraform plan and apply these changes in an automated way. In this way, Terraform allows teams to truly deliver infrastructure-as-code (IaC), which has completely changed how teams and organizations manage their cloud infrastructure.

Infrastructure-as-Code and database management

Most cloud-native applications are backed by a database. The database is often the most critical part of many software systems, so making changes to its schema (the structure and layout of the data inside) is a very risky business. However, schemas must evolve: as functionality changes over time, backing tables are added, columns are dropped, indexes are created for performance reasons, and more.

Therefore, it is surprising that there is no established way of integrating the management of schema changes (commonly called schema "migrations") into popular Infrastructure-as-Code workflows. For this reason, many organizations run migrations from within their application code or use solutions outside the Terraform ecosystem, meaning that management of the production environment is fragmented and hard to synchronize. Atlas aims to change that.

The Atlas Terraform provider allows you to synchronize your database with your desired schema, in a safe and stateful manner. By using Atlas’s core migration engine and embedding it in a Terraform provider, we are enabling teams to manage their database schemas as part of their full IaC workflow. This way, teams can use existing providers (such as AWS or GCP) to provision the database instance and use the Atlas provider to keep the schema in sync. Integrating Atlas with Terraform is especially useful because it couples the state of the infrastructure with the state of the database. It is also extremely neat when using a dev database, which is a feature that combines infrastructure and DB management to provide safety and correctness.

Demo

Prerequisites

Make sure you have Docker and Terraform installed.

Let’s see an example of the provider in action. First, spin up a database using Docker:

docker run -p 3306:3306 --name iloveatlas -e MYSQL_ROOT_PASSWORD=pass -e MYSQL_DATABASE=market -d mysql:8

Great! Now we have a MySQL database instance running.

As an extra measure of safety, we will run another identical database which will serve as a Dev Database. In short, the dev-db helps to catch errors that can only be detected when applying the schema. It is also useful to format the schema in a correct and predictable way. Read more about it here. Run a second instance of MySQL on another port, to serve as a dev-db:

docker run -p 3307:3306 --name devdb-greatness -e MYSQL_ROOT_PASSWORD=pass -e MYSQL_DATABASE=market -d mysql:8

Next, we need an HCL file describing the desired state of our database. You can use the Atlas CLI to inspect the state of an existing database, or you can use the following basic schema:

schema.hcl
table "orders" {
schema = schema.market
column "id" {
null = false
type = int
auto_increment = true
}
column "name" {
null = false
type = varchar(20)
}
primary_key {
columns = [column.id]
}
}

schema "market" {
charset = "utf8mb4"
collate = "utf8mb4_0900_ai_ci"
}

Save the schema file locally in a file named schema.hcl.

Now that we have our database schema, we can use Terraform to apply it to our database. Create a file named main.tf and copy the following snippet:

main.tf
terraform {
  required_providers {
    atlas = {
      version = "~> 0.4.0"
      source  = "ariga/atlas"
    }
  }
}

provider "atlas" {}

data "atlas_schema" "market" {
  dev_db_url = "mysql://root:pass@localhost:3307/market"
  src        = file("${path.module}/schema.hcl")
}

resource "atlas_schema" "market" {
  hcl        = data.atlas_schema.market.hcl
  url        = "mysql://root:pass@localhost:3306/market"
  dev_db_url = "mysql://root:pass@localhost:3307/market"
}

Finally, initialize Terraform:

terraform init

And apply the schema to the database by executing:

terraform apply --auto-approve

Awesome! Now your database should have a table named orders. To verify this, we can connect to the database:

$ docker exec -it iloveatlas mysql -ppass --database=market

mysql> show tables;
+------------------+
| Tables_in_market |
+------------------+
| orders           |
+------------------+
1 row in set (0.00 sec)

mysql> show create table orders;
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| orders | CREATE TABLE `orders` (
`id` int NOT NULL AUTO_INCREMENT,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

For more examples and documentation visit the official GitHub repository or the provider page on Terraform registry.

What's next

In this post, we presented the Atlas Terraform Provider. The provider currently supports the basic, declarative migration workflow that is available in the Atlas engine. In upcoming versions, we will add support for an additional kind of workflow that is supported by the engine and is called versioned migration authoring. In addition, more advanced safety checks (such as simulation on database snapshots) and migration strategies are also being worked on.

While the Terraform provider has just been released, the core engine that it drives is well tested and widely used (especially as the migration engine backing the popular Ent framework). If you, like me, have always wanted to manage your database schema as part of your team's infrastructure-as-code workflow, give the Atlas Terraform provider a try!

Have questions? Feedback? Find our team on our Discord server.