Migration Analyzers
The database is often the most critical component in software architectures. Being a stateful component, it cannot be easily rebuilt, scaled-out or fixed by a restart. Outages that involve damage to data or simply unavailability of the database are notoriously hard to manage and recover from, often taking long hours of careful work by a team's most senior engineers.
As most outages happen directly as a result of a change to a system, Atlas provides users with means to verify the
safety of planned changes before they happen. The sqlcheck
package provides interfaces for analyzing the contents of SQL files to generate insights on the safety of many kinds of
changes to database schemas. With this package developers may define an Analyzer
that can be used to diagnose the impact
of SQL statements on the target database.
Using these interfaces, Atlas provides different Analyzer
implementations that are useful for determining the
safety of migration scripts.
Analyzers​
Below are the Analyzer
implementations currently supported by Atlas.
Destructive Changes​
Destructive changes are changes to a database schema that result in loss of data. For instance, consider a statement such as:
ALTER TABLE `users` DROP COLUMN `email_address`;
This statement is considered destructive because whatever data is stored in the email_address
column
will be deleted from disk, with no way to recover it. There are definitely situations where this type
of change is desired, but they are relatively rare. Using the destructive
(GoDoc)
Analyzer, teams can detect this type of change and design workflows that prevent it from happening accidentally.
Running migration linting locally on in CI fails with exit code 1 in case destructive changes are detected. However,
users can disable this by configuring the destructive
analyzer in the atlas.hcl
file:
lint {
destructive {
error = false
}
}
Data-dependent Changes​
Data-dependent changes are changes to a database schema that may succeed or fail, depending on the data that is stored in the database. For instance, consider a statement such as:
ALTER TABLE `example`.`orders` ADD UNIQUE INDEX `idx_name` (`name`);
This statement is considered data-dependent because if the orders
table
contains duplicate values on the name column we will not be able to add a uniqueness
constraint. Consider we added two records with the name atlas
to the table:
mysql> create table orders ( name varchar(100) );
Query OK, 0 rows affected (0.11 sec)
mysql> insert into orders (name) values ("atlas");
Query OK, 1 row affected (0.06 sec)
mysql> insert into orders (name) values ("atlas");
Query OK, 1 row affected (0.01 sec)
Attempting to add a uniqueness constraint on the name
column, will fail:
mysql> ALTER TABLE `example`.`orders` ADD UNIQUE INDEX `idx_name` (`name`);
This type of change is tricky because a developer trying to simulate it locally
might succeed in performing it only to be surprised that their migration script
fails in production, breaking a deployment sequence or causing other unexpected
behavior. Using the data_depend
(GoDoc)
Analyzer, teams can detect this risk early and account for it in pre-deployment checks to a database.
By default, data-dependent changes are reported but not cause migration linting to fail. Users can change this by
configuring the data_depend
analyzer in the atlas.hcl
file:
lint {
data_depend {
error = true
}
}
Backward Incompatible Changes​
Backward-incompatible changes, also known as breaking changes, are schema changes that have the potential to break the contract with applications that rely on
the old schema. For instance, renaming a column from email_address
to email
can cause errors during deployment
(migration) phase if applications running the previous version of the schema reference the old column name in their
queries.
By default, detected breaking changes are reported but do not cause migration linting to fail. Users can change this by
configuring the incompatible
analyzer in the atlas.hcl
file:
lint {
incompatible {
error = true
}
}
Naming Conventions Policy​
In database schema design, maintaining consistency and readability through naming conventions is a widely common practice. Atlas provides an analyzer that can help enforce naming conventions on a variety of schema resources, including tables, columns, and indexes.
Users can enable this by configuring the naming
analyzer in their
atlas.hcl
file:
- Global policy
- Resource-specific policy
lint {
naming {
match = "^[a-z]+$"
message = "must be lowercase"
}
}
lint {
naming {
match = "^[a-z]+$"
message = "must be lowercase"
index {
match = "^[a-z]+_idx$"
message = "must be lowercase and end with _idx"
}
// schema, table, column, foreign_key and check are also supported.
}
}
By default, detected naming violations are reported but do not cause migration linting to fail. Users can change this by
configuring the naming
analyzer in the atlas.hcl
file:
lint {
naming {
error = true
match = "^[a-z]+$"
message = "must be lowercase"
}
}
Concurrent Index Policy (PostgreSQL)​
Schema changes like CREATE INDEX
or DROP INDEX
can cause the database to lock the table against write operations.
Luckily, PostgreSQL provides the
CONCURRENTLY
option that
may be more resource-intensive, but allows normal database operations to continue while the index is built or dropped.
Atlas provides an analyzer that identifies non-concurrent index creation or deletion for tables not created within the same file, and recommends executing them concurrently.
Additionally, since indexes cannot be created or deleted concurrently within a transaction, Atlas ensures the
atlas:txmode none
directive exists in the file header to prevent this file from running in a transaction. This check
can be disabled along with the other ones as follows:
lint {
concurrent_index {
check_create = false // `true` by default.
check_drop = false // `true` by default.
check_txmode = false // `true` by default.
}
}
By default, detected concurrent index violations are reported but do not cause migration linting to fail. Users can
change this by configuring the concurrent_index
analyzer in the
atlas.hcl
file:
lint {
concurrent_index {
error = true
}
}
Checks​
The following schema change checks are provided by Atlas:
Check | Short Description |
---|---|
AR | Atlas cloud checks |
AR101 | Creating table with non-optimal data alignment |
BC | Backward incompatible changes |
BC101 | Renaming a table |
BC102 | Renaming a column |
CD | Constraint deletion changes |
CD101 | Foreign-key constraint was dropped |
DS | Destructive changes |
DS101 | Schema was dropped |
DS102 | Table was dropped |
DS103 | Non-virtual column was dropped |
LT | SQLite specific checks |
LT101 | Modifying a nullable column to non-nullable without a DEFAULT value |
MF | Data-dependent changes (changes that might fail) |
MF101 | Add unique index to existing column |
MF102 | Modifying non-unique index to unique |
MF103 | Adding a non-nullable column to an existing table |
MF104 | Modifying a nullable column to non-nullable |
MY | MySQL and MariaDB specific checks |
MY101 | Adding a non-nullable column without a DEFAULT value to an existing table |
MY102 | Adding a column with an inline REFERENCES clause has no actual effect |
NM | Naming Conventions |
NM101 | Schema name violates the naming convention |
NM102 | Table name violates the naming convention |
NM103 | Column name violates the naming convention |
NM104 | Index name violates the naming convention |
NM105 | Foreign-key constraint name violates the naming convention |
NM106 | Check constraint name violates the naming convention |
PG1 | Concurrent Indexes |
PG101 | Missing the CONCURRENTLY in index creation |
PG102 | Missing the CONCURRENTLY in index deletion |
PG103 | Missing atlas:txmode none directive in file header |
DS101​
Destructive change that is reported when a database schema was dropped. For example:
DROP SCHEMA test;
DS102​
Destructive change that is reported when a table schema was dropped. For example:
DROP TABLE test.t;
DS103​
Destructive change that is reported when a non-virtual column was dropped. For example:
ALTER TABLE t DROP COLUMN c;
MF101​
Adding a unique index to a table might fail in case one of the indexed columns contain duplicate entries. For example:
CREATE UNIQUE INDEX i ON t(c);
MF102​
Modifying a non-unique index to be unique might fail in case one of the indexed columns contain duplicate entries.
Since index modification is done with DROP
and CREATE
, this check will be reported only when analyzing changes
programmatically or when working with the declarative workflow.
MF103​
Adding a non-nullable column to a table might fail in case the table is not empty. For example:
ALTER TABLE t ADD COLUMN c int NOT NULL;
MF104​
Modifying nullable column to non-nullable might fail in case it contains NULL values. For example:
ALTER TABLE t MODIFY COLUMN c int NOT NULL;
The solution, in this case, is to backfill NULL
values with a default value:
UPDATE t SET c = 0 WHERE c IS NULL;
ALTER TABLE t MODIFY COLUMN c int NOT NULL;
BC101​
Renaming a table is a backward-incompatible change that can cause errors during deployment (migration) if applications running the previous version of the schema refer to the old name in their statements. For example:
ALTER TABLE `users` RENAME TO `Users`;
Unlike other checks, there is no single correct way to resolve this one. Here are some possible solutions:
It's likely that this change was introduced when you renamed one of the entities in your ORM, and the linter helped you catch the potential problem this could cause. In such cases, most ORM frameworks allow you to rename the entity while still pointing to its previous table name. e.g., here is how you can do it in Ent and in GORM.
If renaming is desired but the previous version of the application uses the old table name, a temporary
VIEW
can be created to mimic the previous schema version in the deployment phase. However, the downside of this solution is that mutations using the old table name will fail. Yet, if Atlas detects a consecutive statement with aCREATE VIEW <old_name>
, it will ignore this check.ALTER TABLE `users` RENAME TO `Users`;
CREATE VIEW `users` AS SELECT * FROM `Users`;If renaming the table is desired and no clients depend on it yet, or if it is acceptable to return errors during migration phase when traffic is minimal, you can configure Atlas to ignore this check with the following directive:
-- atlas:nolint BC101
ALTER TABLE `users` RENAME TO `Users`;
BC102​
Renaming a column is a backward-incompatible change that can cause errors during deployment (migration) if applications running the previous version of the schema refer to the old column name in their statements. For example:
ALTER TABLE `users` RENAME COLUMN `user_name` TO `name`;
Unlike other checks, there is no single correct way to resolve this one. Here are some possible solutions:
It's likely that this change was introduced when you renamed a field in one of the entities in your ORM, and the linter helped you catch the potential problem this could cause. In such cases, most ORM frameworks allow you to rename a field while still pointing to its previous column name. e.g., in Ent you can configure it using the
StorageKey
option, and in GORM you can set it by adding thecolumn
struct tag.If renaming is desired but the previous version of the application uses the old column name, a temporary
VIRTUAL
generated column can be created to mimic the previous schema version in the deployment phase. However, the downside of this solution is that mutations using the old column name will fail. Yet, if Atlas detects a consecutive command with such a column, it will ignore this check.-- For MySQL or MariaDB:
ALTER TABLE `posts` RENAME COLUMN `id` TO `uid`, ADD COLUMN `id` int AS (`uid`);
-- SQLite:
ALTER TABLE `posts` RENAME COLUMN `id` TO `uid`;
ALTER TABLE `posts` ADD COLUMN `id` int AS (`uid`);If renaming the column is desired and no clients depend on it yet, or if it is acceptable to return errors during the migration phase when traffic is minimal, you can configure Atlas to ignore this check with the following directive:
-- atlas:nolint BC102
ALTER TABLE `posts` RENAME COLUMN `id` TO `uid`;
CD101​
Constraint deletion is reported when a foreign-key constraint was dropped. For example:
ALTER TABLE pets DROP CONSTRAINT owner_id;
MY101​
Adding a non-nullable column to a table without a DEFAULT
value implicitly sets existing rows with the column
zero (default) value. For example:
ALTER TABLE t ADD COLUMN c int NOT NULL;
-- Append column `c` to all existing rows with the value 0.
MY102​
Adding a column with an inline REFERENCES
clause has no actual effect. Users should define a separate FOREIGN KEY
specification instead. For example:
-CREATE TABLE pets (owner_id int REFERENCES users(id));
+CREATE TABLE pets (owner_id int, FOREIGN KEY (owner_id) REFERENCES users(id));
LT101​
Modifying a nullable column to non-nullable without setting a DEFAULT
might fail in case it contains NULL
values.
The solution is one of the following:
1. Set a DEFAULT
value on the modified column:
-- create "new_users" table
CREATE TABLE `new_users` (`a` int NOT NULL DEFAULT 1);
-- copy rows from old table "users" to new temporary table "new_users"
INSERT INTO `new_users` (`a`) SELECT IFNULL(`a`, 1) FROM `users`;
-- drop "users" table after copying rows
DROP TABLE `users`;
-- rename temporary table "new_users" to "users"
ALTER TABLE `new_users` RENAME TO `users`;
2. Backfill NULL
values with a default value:
-- backfill previous rows
UPDATE `users` SET `a` = 1 WHERE `a` IS NULL;
-- disable the enforcement of foreign-keys constraints
PRAGMA foreign_keys = off;
-- create "new_users" table
CREATE TABLE `new_users` (`a` int NOT NULL);
-- copy rows from old table "users" to new temporary table "new_users"
INSERT INTO `new_users` (`a`) SELECT `a` FROM `users`;
-- drop "users" table after copying rows
DROP TABLE `users`;
-- rename temporary table "new_users" to "users"
ALTER TABLE `new_users` RENAME TO `users`;
-- enable back the enforcement of foreign-keys constraints
PRAGMA foreign_keys = on;
AR101​
Creating a table with optimal data alignment may help minimize the amount of required disk space. For example consider the next Postgres table on a 64-bit system:
CREATE TABLE accounts (
id bigint PRIMARY KEY,
premium boolean,
balance integer,
age smallint
);
Each tuple in the table takes 24 bytes of successive memory without the header.
the id
attribute takes 8 bytes, the premium
takes 1 byte and 3 bytes of padding, the balance
takes 4 bytes and the age
takes 2 bytes,
and lastly 6 bytes of padding allocated for the end of the row.
In total 9 bytes of padding are allocated for each row.
Compared to same table with different ordering which only takes 16 bytes in memory with 1 byte of padding:
CREATE TABLE accounts (
id bigint PRIMARY KEY,
balance integer,
age smallint,
premium boolean
);
NM101​
A schema has been given a name that violates the naming convention.
NM102​
A table has been given a name that violates the naming convention.
NM103​
A column has been given a name that violates the naming convention.
NM104​
An index has been given a name that violates the naming convention.
NM105​
A foreign-key constraint has been given a name that violates the naming convention.
NM106​
A check constraint has been given a name that violates the naming convention.
PG101​
Creating an index non-concurrently locks the table against write operations.
PG102​
Dropping an index non-concurrently locks the table against write operations.
PG103​
Indexes cannot be created or deleted concurrently within a transaction. Add the atlas:txmode none
directive to the
header to prevent this file from running in a transaction.