Preventing accidental column deletion on MySQL
Introduction
This document describes in detail the detection of dropping non-virtual columns for MySQL databases by Atlas's linting capabilities.
MySQL allows users to modify existing table structures using the
ALTER TABLE
statement. Using this statement, users can drop columns that are
no longer needed. For example:
ALTER TABLE t DROP COLUMN c;
While this is sometimes desired, deleting a column from an active database can have various forms of negative impact on an application which we describe below.
To prevent damage to production databases, many teams employ a policy that prohibits dropping columns (or database resources altogether). Atlas's migrate lint can help teams automatically detect migrations that drop columns during continuous integration. For example:
atlas migrate lint --dev-url mysql://root:pass@localhost:3306/dev --latest 1
May detect:
20221023123017.sql: destructive changes detected:
L1: Dropping non-virtual column "c2"
Impact of dropping columns
Data loss
Destructive changes are changes made to a database schema that result in loss of data. For instance, consider a statement such as:
ALTER TABLE `users` DROP COLUMN `email_address`;
This statement is considered destructive because whatever data is stored in the email_address column will be deleted from disk, with no way to recover it. If this happens accidentally during a schema change, the business impact of this change may be dire.
Contrary to many other types of changes, dropping a column is not a reversible operation. In other words, the "reverse" statement of the one above is not a symmetric operation. Consider this statement:
ALTER TABLE `users` ADD COLUMN `email_address` varchar(255);
Running this query may "roll-back" the column drop operation, but any data stored in this column will be lost.
Backwards Incompatibility
In the past, it was acceptable in many situations to stop an application for short maintenance windows during which database administrators could apply changes to the database schema. After the changes were applied, a new version of the application, depending on the new schema, would be deployed.
Today, most applications that serve traffic via the internet are expected to be available 24/7, making such maintenance windows unacceptable. Dropping a column while running an application that relies on it will cause the database to return errors. This type of change is also known as a backwards-incompatible change.
ERROR 1054 (42S22): Unknown column
Suppose we dropped a column named "name". A running application may rely on this column to be present and try to read or write records including it:
insert into t (id, name) values (1,2);
--- OR
select name from t;
MySQL will return:
ERROR 1054 (42S22): Unknown column 'name' in 'field list'
Prevention
Preventing accidental column deletion is easy with Atlas's migrate lint
command. With the atlas migrate lint
command, users can analyze the migration directory to
detect potentially dangerous changes to the database schema. This command may be
incorporated in continuous integration pipelines to enable teams to enforce
desired policies with regard to schema changes.
When using migrate lint to analyze migrations, users must supply multiple parameters:
--dev-url
- a URL to a Dev-database that will be used to simulate the changes and verify their correctness.--dir
- the URL of the migration directory, by default it is file://migrations, e.g a directory named migrations in the current working directory.
Changeset detection
When we run the lint command, we need to instruct Atlas on how to decide what set of migration files to analyze. Currently, two modes are supported:
--git-base <branchName>
: which selects the diff between the provided branch and the current one as the changeset.--latest <n>
which selects the latest n migration files as the changeset.
Examples
Analyze all changes relative to the master
Git branch:
atlas migrate lint \
--dir "file://my/project/migrations" \
--dev-url "mysql://root:pass@localhost:3306/dev" \
--git-base "master"
Analyze the latest 2 migration files:
atlas migrate lint \
--dir "file://my/project/migrations" \
--dev-url "mysql://root:pass@localhost:3306/dev" \
--latest 2