Snowflake: External Volumes

Atlas now supports Snowflake external volumes as a first-class account-level resource, with declarative management of S3, GCS, S3COMPAT, S3GOV, and Azure storage locations, and seamless references from iceberg_table blocks.

An external volume is a Snowflake account-level object that represents one or more cloud storage locations (S3, GCS, or Azure Blob Storage) that Iceberg tables use for data and metadata storage. With the new external_volume block, you can inspect, diff, and migrate external volumes alongside the rest of your Snowflake schema.

Enabling External Volume Management

External volume management is opt-in. Set external_volumes = true in the mode "snowflake" block, nested inside the schema block of your env, to include external volumes in inspect, diff, and apply:

env "prod" {
  url = "snowflake://user:pass@account_identifier/database"
  dev = "snowflake://user:pass@dev_account_identifier/database"
  schema {
    src = "file://schema.hcl"
    mode "snowflake" {
      external_volumes = true
    }
  }
}

Important: External volumes are account-level objects and do not belong to any database or schema, so pointing dev at another database in the same account does not isolate them. Use a separate Snowflake account as the dev database.

S3 External Volume

Define an external volume backed by an S3 bucket using the storage_location block with "S3" as the first label and your chosen location name as the second. Supply storage_aws_role_arn for the IAM role Snowflake uses to access the bucket:

external_volume "my_s3_vol" {
  storage_location "S3" "s3_us_east" {
    storage_base_url        = "s3://my-bucket/iceberg/"
    storage_aws_role_arn    = "arn:aws:iam::123456789012:role/snowflake-role"
    storage_aws_external_id = "MY_SNOWFLAKE_SFC_STG_EXTERNALID"
  }
  allow_writes = true
  comment      = "S3 external volume for Iceberg tables"
}

Atlas emits a CREATE EXTERNAL VOLUME statement with the storage location inlined:

CREATE EXTERNAL VOLUME "my_s3_vol"
  STORAGE_LOCATIONS = (
    (
      NAME = 's3_us_east'
      STORAGE_PROVIDER = 'S3'
      STORAGE_BASE_URL = 's3://my-bucket/iceberg/'
      STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-role'
      STORAGE_AWS_EXTERNAL_ID = 'MY_SNOWFLAKE_SFC_STG_EXTERNALID'
    )
  )
  ALLOW_WRITES = TRUE
  COMMENT = 'S3 external volume for Iceberg tables';

Snowflake auto-generates STORAGE_AWS_EXTERNAL_ID when no value is provided. Because Atlas compares the live value against your HCL on every plan, omitting storage_aws_external_id will produce a persistent diff after the first apply. There are two ways to avoid this:

  • Inspect after the first apply. Run atlas schema inspect, then copy the generated storage_aws_external_id value from the output back into your HCL.
  • Pre-define a known value. Set storage_aws_external_id in your HCL before the first apply (as shown in the example above) and Snowflake will use that value instead of generating one.
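
If you prefer to read the generated value directly from Snowflake rather than from the atlas schema inspect output, a query along these lines may help. It is a sketch and not part of Atlas: it assumes the my_s3_vol volume from the example above, and the output column names (property, property_value) and the STORAGE_LOCATION_1 row name are assumptions about the DESC EXTERNAL VOLUME result layout that you should verify in your account.

```sql
-- List the volume's properties; each storage location is reported as a JSON value.
DESC EXTERNAL VOLUME "my_s3_vol";

-- Pull the external ID out of the first storage location's JSON.
-- Assumption: the DESC output exposes lowercase "property" and "property_value"
-- columns and names the first location STORAGE_LOCATION_1.
SELECT PARSE_JSON("property_value"):STORAGE_AWS_EXTERNAL_ID::STRING AS external_id
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "property" = 'STORAGE_LOCATION_1';
```

The second statement must run immediately after the first in the same session, because LAST_QUERY_ID() refers to the most recent query.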

GCS and Azure

The same two-label block shape works for GCS and Azure. Use "GCS" or "AZURE" as the first label. For GCS, only storage_base_url is required. For Azure, also supply azure_tenant_id for your Active Directory tenant:

external_volume "my_gcs_vol" {
  storage_location "GCS" "gcs_eu" {
    storage_base_url = "gcs://my-gcs-bucket/iceberg/"
  }
  allow_writes = true
}

external_volume "my_azure_vol" {
  storage_location "AZURE" "azure_westus" {
    storage_base_url = "azure://myaccount.blob.core.windows.net/mycontainer/iceberg/"
    azure_tenant_id  = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
  }
  allow_writes = true
}
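
By analogy with the S3 example, the statements for these two volumes would presumably look like the following. This is extrapolated from Snowflake's CREATE EXTERNAL VOLUME syntax rather than copied from Atlas output, so treat the exact quoting and layout as assumptions.

```sql
-- GCS: only the base URL is needed in the storage location.
CREATE EXTERNAL VOLUME "my_gcs_vol"
  STORAGE_LOCATIONS = (
    (
      NAME = 'gcs_eu'
      STORAGE_PROVIDER = 'GCS'
      STORAGE_BASE_URL = 'gcs://my-gcs-bucket/iceberg/'
    )
  )
  ALLOW_WRITES = TRUE;

-- Azure: the tenant ID accompanies the base URL.
CREATE EXTERNAL VOLUME "my_azure_vol"
  STORAGE_LOCATIONS = (
    (
      NAME = 'azure_westus'
      STORAGE_PROVIDER = 'AZURE'
      STORAGE_BASE_URL = 'azure://myaccount.blob.core.windows.net/mycontainer/iceberg/'
      AZURE_TENANT_ID = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
    )
  )
  ALLOW_WRITES = TRUE;
```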

Multiple Storage Locations

A single external volume can include multiple storage_location blocks. Each block specifies its own provider, base URL, and credentials:

external_volume "multi_region_vol" {
  storage_location "S3" "s3_primary" {
    storage_base_url     = "s3://primary-bucket/iceberg/"
    storage_aws_role_arn = "arn:aws:iam::123456789012:role/snowflake-role"
  }
  storage_location "S3" "s3_failover" {
    storage_base_url     = "s3://failover-bucket/iceberg/"
    storage_aws_role_arn = "arn:aws:iam::123456789012:role/snowflake-role"
  }
  allow_writes = true
  comment      = "Multi-region volume with failover"
}
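
In SQL terms, both blocks would be expected to land in a single STORAGE_LOCATIONS list, in the order they appear in the HCL. The statement below is a sketch based on Snowflake's syntax, not verified Atlas output.

```sql
-- Assumed shape: one CREATE statement, one list entry per storage_location block.
CREATE EXTERNAL VOLUME "multi_region_vol"
  STORAGE_LOCATIONS = (
    (
      NAME = 's3_primary'
      STORAGE_PROVIDER = 'S3'
      STORAGE_BASE_URL = 's3://primary-bucket/iceberg/'
      STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-role'
    ),
    (
      NAME = 's3_failover'
      STORAGE_PROVIDER = 'S3'
      STORAGE_BASE_URL = 's3://failover-bucket/iceberg/'
      STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-role'
    )
  )
  ALLOW_WRITES = TRUE
  COMMENT = 'Multi-region volume with failover';
```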

Referencing an External Volume from an Iceberg Table

Once an external volume is defined, reference it from an iceberg_table block as a direct object reference (external_volume.<name>). Because the table depends on the volume, Atlas creates the volume before the table and, on teardown, drops the table before the volume:

external_volume "my_s3_vol" {
  storage_location "S3" "s3_us_east" {
    storage_base_url     = "s3://my-bucket/iceberg/"
    storage_aws_role_arn = "arn:aws:iam::123456789012:role/snowflake-role"
  }
  allow_writes = true
}

iceberg_table "events" {
  schema          = schema.PUBLIC
  external_volume = external_volume.my_s3_vol
  catalog         = "SNOWFLAKE"
  base_location   = "events/"
  column "id" {
    type = NUMBER(10)
  }
  column "created_at" {
    type = TIMESTAMP_NTZ(6)
  }
  primary_key {
    columns = [column.id]
  }
}
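
The dependency ordering can be pictured as the statement sequence below. This is an illustrative sketch written against Snowflake's syntax, not captured Atlas output: column nullability is omitted, and the way the volume name is quoted in EXTERNAL_VOLUME is an assumption.

```sql
-- Apply: the volume must exist before any table that references it.
CREATE EXTERNAL VOLUME "my_s3_vol"
  STORAGE_LOCATIONS = (
    (
      NAME = 's3_us_east'
      STORAGE_PROVIDER = 'S3'
      STORAGE_BASE_URL = 's3://my-bucket/iceberg/'
      STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-role'
    )
  )
  ALLOW_WRITES = TRUE;

CREATE ICEBERG TABLE "PUBLIC"."events" (
  "id" NUMBER(10),
  "created_at" TIMESTAMP_NTZ(6),
  PRIMARY KEY ("id")
)
  EXTERNAL_VOLUME = 'my_s3_vol'  -- assumption: exact quoting may differ
  CATALOG = 'SNOWFLAKE'
  BASE_LOCATION = 'events/';

-- Teardown: reverse order, so the table is dropped before the volume it uses.
DROP ICEBERG TABLE "PUBLIC"."events";
DROP EXTERNAL VOLUME "my_s3_vol";
```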

For the complete attribute reference, see the external_volume HCL documentation.
