Skip to main content

The Atlas Kubernetes Operator

Background

What are Kubernetes Operators?

For many stateful resources, reconciling the desired state of a database with its actual state can be a complex task that requires a lot of domain knowledge. Kubernetes Operators were introduced to the Kubernetes ecosystem to help users manage complex stateful resources by codifying this domain knowledge into a Kubernetes controller.

What is the Atlas Kubernetes Operator?

The Atlas Kubernetes Operator is a Kubernetes controller that uses Atlas to manage your database schema. The Atlas Kubernetes Operator allows you to define the desired schema and apply it to your database using the Kubernetes API.

Supported workflows

Declarative schema migrations

The Atlas Kubernetes Operator supports declarative migrations. In declarative migrations, the desired state of the database is defined by the user and the operator is responsible for reconciling the desired state with the actual state of the database (planning and executing CREATE, ALTER and DROP statements).

Getting started

Installation

The Atlas Kubernetes Operator is available as a Helm Chart. To install the chart with the release name atlas-operator:

helm install atlas-operator oci://ghcr.io/ariga/charts/atlas-operator

Credentials

In order to manage the schema of your database, you must provide the Atlas Kubernetes Operator with an Atlas URL to your database. The Atlas Kubernetes Operator will use this URL to connect to your database and apply changes to the schema. As this string usually contains sensitive information, we recommend storing it as a Kubernetes secret.

Notice that depending on your use-case, you may want to use either a schema-bound or a database-bound connection.

  • Schema-bound connections are recommended for most use-cases, as they allow you to manage the schema of a database schema (named database). This is akin to selecting a specific database using a USE statement.
  • Database-bound connections are recommended for use-cases where you want to manage multiple schemas in a single database at once.

Create a file named db-credentials.yaml with the following contents:

db-credentials.yaml
apiVersion: v1
kind: Secret
metadata:
name: mysql-credentials
type: Opaque
stringData:
url: "mysql://user:pass@your.db.dns.default:3306/myapp"

This defines a schema-bound connection to a named database called myapp.

Apply this file to your cluster:

kubectl apply -f db-credentials.yaml

Defining the desired schema

The Atlas Operator relies on a Kubernetes custom resource called AtlasSchema to define the desired schema of your database. You can find the full definition for this resource on the ariga/atlas-operator repo. This resource describes the desired state of a target database that is used in the operator's reconciliation loop.

Let's review an example of such as a resource. Create a file named atlas-schema.yaml with the following contents:

atlas-schema.yaml
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasSchema
metadata:
name: myapp
spec:
# Load the URL of the target database from a Kubernetes secret.
urlFrom:
secretKeyRef:
key: url
name: mysql-credentials
# Define the desired schema of the target database. This can be defined in either
# plain SQL like this example or in Atlas HCL.
schema:
sql: |
create table users (
id int not null auto_increment,
name varchar(255) not null,
email varchar(255) unique not null,
short_bio varchar(255) not null,
primary key (id)
);
# Define policies that control how the operator will apply changes to the target database.
policy:
# Define a policy that will stop the reconciliation loop if the operator detects
# a destructive change such as dropping a column or table.
lint:
destructive:
error: true
# Define a policy that omits DROP COLUMN changes from any produced plan.
diff:
skip:
drop_column: true
# Exclude a table from being managed by the operator. This is useful for resources
# that belong to other applications and are managed in other ways.
exclude:
- external_table_managed_elsewhere

Apply this file to your cluster:

kubectl apply -f atlas-schema.yaml

Wait for the operator to reconcile the desired state with the actual state of the database:

kubectl wait --for=condition=Ready atlasschema/myapp --timeout=60s

Alternatively, you can use a ConfigMap to store the desired schema of your database. To use this approach, create a ConfigMap with the desired schema:

atlas-schema-configmap.yaml
kind: ConfigMap
apiVersion: v1
metadata:
name: mysql-schema
data:
schema.sql: |
create table users (
id int not null auto_increment,
name varchar(255) not null,
email varchar(255) unique not null,
short_bio varchar(255) not null,
primary key (id)
);

Then, update the AtlasSchema resource to reference the config map key:

atlas-schema.yaml
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasSchema
metadata:
name: myapp
spec:
schema:
configMapKeyRef:
key: schema.sql
name: mysql-schema
# Skipping all other fields for brevity.

Configuration for the AtlasSchema resource

The AtlasSchema resource is used to define the desired state of a target database.

Available fields:

urlFrom

The urlFrom field is used to load the URL of the target database from another resource such as a Kubernetes secret.

urlFrom:
secretKeyRef:
key: url
name: mysql-credentials

url

The url field is used to define the URL of the target database directly. This is not recommended, but may be useful for testing purposes.

url: "mysql://user:pass@localhost:3306/myapp"

schema

The schema field is used to define the desired schema of the target database.

This can be defined in either plain SQL or in Atlas HCL.

schema:
sql: |
create table users (
id int not null auto_increment,
name varchar(255) not null,
email varchar(255) uniuse
);
schema:
hcl: |
schema "myapp" {
}
table "users" {
column "id" {
type = int
}
// ...
}

Alternatively, you can reference a key in a ConfigMap to store the desired schema of your database:

schema:
configMapKeyRef:
key: schema.sql
name: mysql-schema

Note: the .sql suffix on the key denotes that the schema is defined in plain SQL. If you are using Atlas HCL, you should use the .hcl suffix instead.

policy

The policy field is used to define policies that control how the operator will apply changes to the target database.

lint

The lint field is used to define policies that control how the operator will lint the desired schema.

policy:
lint:
destructive:
error: true

Currently, the only lint policy available is destructive. This policy controls what the operator will do if it detects a destructive change such as dropping a column or table.

diff

The diff field is used to define the diff policies that are used to plan the schema changes.

policy:
diff:
skip:
drop_column: true

Currently, the only diff policy available is skip. This policy controls which kind of schema changes will be skipped from the produced plan. The following options are available:

  • add_column, add_foreign_key, add_index, add_schema, add_table, drop_column, drop_foreign_key, drop_index, drop_schema, drop_table, modify_column, modify_foreign_key, modify_index, modify_schema, modify_table

exclude

The exclude field is used to define tables that should be excluded from being managed by the operator. This is useful for resources that belong to other applications and are managed in other ways.

exclude:
- external_table_managed_elsewhere

schemas

The schemas field is used to define schemas that should be managed by the operator. This is used to manage multiple schemas in a single database and works only if the connection string is database-bound.

schemas:
- schema1
- schema2