Safe NOT NULL Migrations on PostgreSQL: Avoid Aborted Deploys and Table Locks
The Problem
Adding a NOT NULL constraint to an existing column is a routine schema change, but on PostgreSQL, a
plain ALTER TABLE ... ALTER COLUMN ... SET NOT NULL has two failure modes that are easy to miss when
testing locally or against data that does not represent production:
- Data-dependent change: The column might already hold
NULLs in production. If it does,SET NOT NULLfails and the migration fails with it. Since success depends on the data, the migration can pass on a local or test database and still fail in production where theNULLs actually exist. Atlas flags this asMF104. - Blocking table change: Even when no
NULLs exist, PostgreSQL still has to verify it.SET NOT NULLacquires anACCESS EXCLUSIVElock, which blocks reads as well as writes, and holds it while it scans every row in the table. The scan is instant on a small table, but on a large or busy one, the lock can be held long enough to stall queries and the deployment. Atlas reports this asPG303.
The safe way to make this change is well-known: backfill the data, then add the constraint without a blocking scan. Doing this by hand for every column is tedious and easy to get wrong, so Atlas automates it for you.
Enter: Atlas
Atlas can plan this change safely for you. You configure the behavior once in atlas.hcl, and it is
applied to every migration Atlas generates, locally and in CI. The rest of this guide sets this up.
- macOS + Linux
- Homebrew
- Docker
- Windows
- CI
- Manual Installation
To download and install the latest release of the Atlas CLI, simply run the following in your terminal:
curl -sSf https://atlasgo.sh | sh
Get the latest release with Homebrew:
brew install ariga/tap/atlas
To pull the Atlas image and run it as a Docker container:
docker pull arigaio/atlas
docker run --rm arigaio/atlas --help
If the container needs access to the host network or a local directory, use the --net=host flag and mount the desired
directory:
docker run --rm --net=host \
-v $(pwd)/migrations:/migrations \
arigaio/atlas migrate apply \
--url "mysql://root:pass@:3306/test"
Download the latest release and move the atlas binary to a file location on your system PATH.
GitHub Actions
Use the setup-atlas action to install Atlas in your GitHub Actions workflow:
- uses: ariga/setup-atlas@v0
with:
cloud-token: ${{ secrets.ATLAS_CLOUD_TOKEN }}
Other CI Platforms
For other CI/CD platforms, use the installation script. See the CI/CD integrations for more details.
The not_null PostgreSQL diff policy is available only to Atlas Pro users. To use this feature, run:
atlas login
The Solution
By default, when Atlas detects a nullability change in your schema, it generates a migration with a
plain SET NOT NULL:
-- Modify "users" table
ALTER TABLE "users" ALTER COLUMN "name" SET NOT NULL;
We already covered why this change is risky. Atlas's linting catches it too: atlas migrate lint
reports both the data-dependent risk (MF104) and the full table scan
(PG303):
Analyzing changes from version 20260616100000 to 20260616120000 (1 migration in total):
-- analyzing version 20260616120000
-- data dependent changes detected:
-- L3: Modifying nullable column "name" to non-nullable might fail in case it contains NULL values
https://atlasgo.io/lint/analyzers#MF104
-- blocking table changes detected:
-- L3: Modifying nullable column "name" to non-nullable requires a full table scan
https://atlasgo.io/lint/analyzers#PG303
-- ok (2.1ms)
-------------------------
-- 8.4ms
-- 1 version with warnings
-- 1 schema change
-- 2 diagnostics
Atlas can resolve both of these risks with its not_null diff policy,
which we configure in two steps:
- Add a pre-migration check that stops a migration before it runs if a column
still holds
NULLs. This handles the data-dependent risk (MF104). - Switch to the lock-safe strategy, which applies the
NOT NULLwithout ever holding anACCESS EXCLUSIVElock for a table scan. This handles the blocking risk (PG303), with trade-offs which we cover in that step.
The following sections walk through each step.
Step 1: Gate the Change with a Pre-Migration Check
We want Atlas to verify that our column has no NULLs before the migration runs, and to stop the deployment
if it finds any. We instruct it to do so by setting check in the not_null block in atlas.hcl:
diff "postgres" {
not_null {
check = true
}
}
With check on, Atlas no longer writes a plain SQL file. Instead, it writes the version as a
txtar archive: a text-based format for a complete migration
plan in which each section begins with a -- FILENAME -- marker. Atlas puts the
pre-migration check in a checks.sql section and the DDL in a migration.sql
section:
-- atlas:txtar
-- checks.sql --
-- atlas:assert MF104
SELECT NOT EXISTS (SELECT 1 FROM "users" WHERE "name" IS NULL) AS "not_null";
-- migration.sql --
-- Modify "users" table
ALTER TABLE "users" ALTER COLUMN "name" SET NOT NULL;
The checks.sql assertion must pass before migration.sql runs, so if "name" contains any NULL values,
the deployment stops upfront with a clear message rather than of failing halfway through. That solves the first
problem: atlas migrate lint no longer reports MF104. The DDL itself is unchanged, though, so the
full table scan (PG303) is still flagged:
Analyzing changes from version 20260616100000 to 20260616120000 (1 migration in total):
-- analyzing version 20260616120000
-- blocking table changes detected:
-- L3: Modifying nullable column "name" to non-nullable requires a full table scan
https://atlasgo.io/lint/analyzers#PG303
-- ok (1.8ms)
-------------------------
-- 7.9ms
-- 1 version with warnings
-- 1 schema change
-- 1 diagnostic
The data-dependent risk is solved, but the lock (PG303) is still there.
Step 2: Remove the ACCESS EXCLUSIVE Lock
Now, we want Atlas to make the column NOT NULL without holding an ACCESS EXCLUSIVE lock for a full
table scan, which puts us at risk for downtime on a large or busy table. We instruct it to do so by
setting lock_safe alongside check in the not_null block:
diff "postgres" {
not_null {
check = true
lock_safe = true
}
}
lock_safe relies on a property of PostgreSQL: a NOT VALID constraint is added without scanning the
table, so it only constrains future rows. It can later be validated under a SHARE UPDATE EXCLUSIVE
lock, which does not block reads or writes. Once a validated constraint proves the column has no
NULLs, the actual SET NOT NULL is instant because PostgreSQL skips its own scan when an existing
constraint already guarantees the invariant.
The exact form depends on your PostgreSQL version:
- PostgreSQL 18
- PostgreSQL below 18
On PostgreSQL 18, NOT NULL is a first-class, nameable catalog constraint (contype = 'n') that
supports the NOT VALID clause. Atlas uses it directly, so the validated constraint replaces
SET NOT NULL entirely: there is no separate ALTER COLUMN step, and nothing to drop afterward.
-- atlas:txtar
-- checks.sql --
-- atlas:assert MF104
SELECT NOT EXISTS (SELECT 1 FROM "users" WHERE "name" IS NULL) AS "not_null";
-- migration.sql --
-- atlas:txmode none
-- Add a not-null constraint to column "name" without scanning the table
ALTER TABLE "users" ADD CONSTRAINT "users_name_not_null" NOT NULL "name" NOT VALID;
-- Validate constraint "users_name_not_null"
ALTER TABLE "users" VALIDATE CONSTRAINT "users_name_not_null";
The constraint name users_name_not_null follows PostgreSQL's own convention for the implicit name a
plain SET NOT NULL would generate (<table>_<column>_not_null), so a later inspection folds it back
into the column's nullability with no drift.
Versions below 18 have no NOT NULL ... NOT VALID syntax, so Atlas emulates it with a temporary
CHECK (col IS NOT NULL) constraint. The sequence per column is:
ADD CONSTRAINT ... CHECK (col IS NOT NULL) NOT VALID- instant, no scan.VALIDATE CONSTRAINT ...- scans underSHARE UPDATE EXCLUSIVE(reads and writes continue).ALTER COLUMN ... SET NOT NULL- instant; PostgreSQL skips the scan because the validatedCHECKalready proves noNULLs.DROP CONSTRAINT ...- the now-redundant temporary check is removed.
-- atlas:txtar
-- checks.sql --
-- atlas:assert MF104
SELECT NOT EXISTS (SELECT 1 FROM "users" WHERE "name" IS NULL) AS "not_null";
-- migration.sql --
-- atlas:txmode none
-- Add a temporary CHECK constraint covering the nullability of column "name"
ALTER TABLE "users" ADD CONSTRAINT "users_name_not_null" CHECK ("name" IS NOT NULL) NOT VALID;
-- Validate constraint "users_name_not_null"
ALTER TABLE "users" VALIDATE CONSTRAINT "users_name_not_null";
-- Set column "name" to NOT NULL
ALTER TABLE "users" ALTER COLUMN "name" SET NOT NULL;
-- Drop the temporary CHECK constraint "users_name_not_null"
ALTER TABLE "users" DROP CONSTRAINT "users_name_not_null";
DROP is a separate statementPostgreSQL only skips the SET NOT NULL scan if the covering constraint is not dropped in the same
command. Atlas therefore drops the temporary check in its own statement, after the SET NOT NULL.
In both versions the checks.sql is unchanged from Step 1; lock_safe only rewrites the highlighted
migration.sql, so the change stays guarded against dirty data while shedding the lock.
Why the transaction is disabled
Both plans above, on PostgreSQL 18 and below, open with -- atlas:txmode none. That directive tells
Atlas not to wrap the migration in a single transaction, but to run each statement on its own. The
reason comes down to how PostgreSQL handles locks: once acquired, a lock is held until the end of the
transaction.
ADD CONSTRAINT ... NOT VALID is instant on its own because it does not scan the table, but it still
takes a brief ACCESS EXCLUSIVE lock to record the new constraint. Inside a single transaction, that
lock would be held until the transaction commits. The VALIDATE that follows (which by itself takes
only a SHARE UPDATE EXCLUSIVE lock) would run its table scan while the table is still under the
ACCESS EXCLUSIVE lock held by the preceding ADD. That is the exact lock we set out to avoid.
Running each statement on its own lets the ADD commit and release its lock immediately, so VALIDATE
scans under the lighter SHARE UPDATE EXCLUSIVE lock instead. With both options on, the migration
applies the NOT NULL without an ACCESS EXCLUSIVE table scan, and atlas migrate lint passes with
no diagnostics.
lock_safe builds on checklock_safe cannot be used on its own. Removing the lock without first proving the data is clean would
just move the abort from the scan to the SET NOT NULL itself, so Atlas requires check = true
whenever lock_safe is set. Configuring lock_safe alone is an error:
atlas.hcl:3,5-13: attribute check is required when lock_safe is set
When Atlas Skips the Lock-Safe Plan
The full NOT VALID dance is not always needed. When a plain SET NOT NULL would not scan the table
anyway, the extra statements buy nothing, so Atlas falls back to the simplest plan that is still
lock-safe. This is usually a plain SET NOT NULL. This happens in the following cases:
The column is already covered by a CHECK
If the table already has a validated CHECK (col IS NOT NULL) constraint that this migration did not
add, PostgreSQL skips the scan on SET NOT NULL by itself: a valid check already proves no NULLs
exist, so there is nothing left to verify. On the other hand, a NOT VALID check would not help
here, since PostgreSQL has not confirmed it holds for the existing rows. Atlas emits a plain
SET NOT NULL, with no NOT VALID constraint and no txmode none:
-- Modify "users" table
ALTER TABLE "users" ALTER COLUMN "bio" SET NOT NULL;
check = true still adds the MF104 pre-check, and atlas migrate lint recognizes the covering
constraint too, so it does not report PG303. The diff and the linter agree the change is already lock-safe.
The migration adds a covering CHECK
If the same migration both adds a CHECK (col IS NOT NULL) and sets the column NOT NULL, Atlas reuses
that constraint instead of creating a throwaway one: it adds the check NOT VALID, validates it, sets
the column NOT NULL, and keeps the check, since it is part of your desired schema:
-- atlas:txmode none
-- Add the "users_bio_check" constraint covering the nullability of column "bio" without scanning the table
ALTER TABLE "users" ADD CONSTRAINT "users_bio_check" CHECK ("bio" IS NOT NULL) NOT VALID;
-- Validate constraint "users_bio_check"
ALTER TABLE "users" VALIDATE CONSTRAINT "users_bio_check";
-- Modify "users" table
ALTER TABLE "users" ALTER COLUMN "bio" SET NOT NULL;
A single CHECK that covers several columns going NOT NULL in the same migration is added and
validated only once.
Another change already rewrites the table
Some changes rewrite the whole table under an ACCESS EXCLUSIVE lock regardless: adding a column
with a volatile default, a stored generated column, or an identity column; switching the table to
UNLOGGED; or changing its access method.
When one of these is in the same migration, a separate NOT VALID and VALIDATE would only add a
second scan, so Atlas lets the SET NOT NULL ride along on the combined ALTER as a plain statement,
with no txmode none and no temporary constraint:
-- Modify "users" table
ALTER TABLE "users" SET UNLOGGED, ALTER COLUMN "name" SET NOT NULL;
Summary
Adding a NOT NULL constraint to a PostgreSQL column carries two risks: the migration can fail on
existing NULLs, and SET NOT NULL can lock the table for a full scan.
The not_null diff policy removes both: check adds a pre-migration check that stops the deployment
before any DDL when the column still holds NULLs, and lock_safe plans the change with a validated
NOT VALID constraint so the NOT NULL is applied without an ACCESS EXCLUSIVE scan.
| Without policy | check | check + lock_safe | |
|---|---|---|---|
Existing NULLs | Migration fails mid-run | Caught upfront by MF104 | Caught upfront by MF104 |
Table scan under ACCESS EXCLUSIVE | Yes | Yes | No, validated under SHARE UPDATE EXCLUSIVE |
| Lint result | MF104 and PG303 | PG303 | Clean |
For zero-downtime NOT NULL migrations on PostgreSQL, enable both knobs. Together they let the column
become NOT NULL while the table stays readable and writable throughout, with no risk of a failed
deploy:
diff "postgres" {
not_null {
check = true
lock_safe = true
}
}
Next Steps
Atlas plugs into every major CI platform to run migration linting on each pull request, so unsafe changes are caught in review before they reach production:
GitHub Actions
Run migration linting on every pull request with GitHub Actions.
GitLab CI
Run migration linting in your GitLab merge-request pipelines.
Bitbucket
Run migration linting in Bitbucket Pipelines.
Azure DevOps
Run migration linting in your Azure DevOps pipelines.
CircleCI
Run migration linting in your CircleCI workflows.
Destructive Change Policy
Block accidental DROP TABLE and DROP COLUMN in CI with the same diff-policy model.
Migration Analyzers
Browse every built-in check, including PG303 and MF104, and tune which ones fail the build.
Pre-migration Checks
Gate any deployment on a custom assertion that must pass before the migration runs.
Diff Policy Reference
Configure the full diff postgres block and other plan-time policies in atlas.hcl.