Skip to main content

Managing ClickHouse Settings with Atlas

Introduction

ClickHouse provides a powerful settings system that allows you to control query execution behavior, performance characteristics, and resource management. These settings can be applied at different levels: globally, per user/profile, per session, or per query.

When managing your ClickHouse schemas with Atlas, you may need to configure specific settings to enable certain features or optimize schema operations.

Method 1: Settings Profiles

ClickHouse allows you to create settings profiles that define a collection of settings for specific users or roles.

This is the recommended approach for persistent configuration that applies to all Atlas operations.

You can create a dedicated profile for Atlas and assign it to the user with which Atlas connects to your database. Settings profiles can be created using either SQL or XML configuration files.

Using SQL:

-- Create a settings profile for Atlas
CREATE SETTINGS PROFILE atlas_profile SETTINGS
allow_suspicious_low_cardinality_types = 1,
max_memory_usage = 10000000000;

-- Assign the profile to the Atlas user
ALTER USER atlas_user SETTINGS PROFILE atlas_profile;

Using XML configuration:

Alternatively, you can define profiles in your ClickHouse configuration file. See the ClickHouse documentation for details on XML configuration.

Method 2: Connection URL Parameters

You can pass settings directly in the Atlas connection URL as query parameters. These settings apply to all queries executed by Atlas during that connection session.

Syntax

clickhouse://user:password@host:port/database?setting_name=value&another_setting=value

Example

atlas schema apply \
--url "clickhouse://user:password@localhost:9000/default?allow_suspicious_low_cardinality_types=1&max_memory_usage=10000000000"

This approach is useful for:

  • Temporary or one-time operations
  • Testing different setting values
  • Environment-specific configurations without modifying the database

Method 3: Versioned Migration Files

When using Atlas with the versioned migration workflow, you can specify settings directly in your SQL migration files.

Option A: Table-level Settings

Apply settings to a specific table using the SETTINGS clause in the DDL statement:

CREATE TABLE users (
id LowCardinality(Int32)
)
ENGINE = MergeTree
PRIMARY KEY (id)
ORDER BY (id)
SETTINGS index_granularity = 8192, allow_suspicious_low_cardinality_types = 1;

Option B: Session-level Settings

Apply settings at the session level for all subsequent statements in the migration file:

SET allow_suspicious_low_cardinality_types = 1;

CREATE TABLE users (
id LowCardinality(Int32)
)
ENGINE = MergeTree
PRIMARY KEY (id)
ORDER BY (id);
info

Session-level settings using SET apply to all statements that follow in the same migration file.

Method 4: Declarative Workflow Configuration

When using Atlas with the declarative workflow, you can configure settings in your atlas.hcl file with a session_settings block. This allows Atlas to automatically apply the necessary settings when generating migration plans.

Configuration Syntax

env "local" {
diff "clickhouse" {
session_settings {
match = "<pattern>"
values = {
<setting_name> = <setting_value>
}
}
}
}

Parameters

  • match: A glob pattern to match table or database names
    • Use "*" to match all objects when managing a single database
    • Use "database.*" to match all tables in a specific database
    • Use "table_name" to match a specific table
  • values: A map of ClickHouse settings and their values
  • Multiple session_settings blocks can be defined for different matching patterns

Examples

Example 1: Managing a Single Database

When Atlas manages a single database (the default scope), use simple table name patterns:

env "production" {
url = "clickhouse://user:pass@localhost:9000/mydb"

diff "clickhouse" {
// Apply to all tables in the database
session_settings {
match = "*"
values = {
allow_suspicious_low_cardinality_types = 1
index_granularity = 8192
}
}

// Apply to a specific table
session_settings {
match = "users"
values = {
allow_suspicious_low_cardinality_types = 1
}
}

// Apply to tables matching a pattern
session_settings {
match = "events_*"
values = {
max_memory_usage = 10000000000
}
}
}
}

Example 2: Managing Multiple Databases

When Atlas manages multiple databases, use the database.table format in your patterns:

env "production" {
url = "clickhouse://user:pass@localhost:9000"

diff "clickhouse" {
// Apply to all tables in all databases
session_settings {
match = "*.*"
values = {
allow_suspicious_low_cardinality_types = 1
}
}

// Apply to all tables in the 'analytics' database
session_settings {
match = "analytics.*"
values = {
max_memory_usage = 10000000000
}
}

// Apply to a specific table in a specific database
session_settings {
match = "analytics.events"
values = {
index_granularity = 16384
}
}
}
}

Example 3: Combining Multiple Rules

You can define multiple session_settings blocks to handle different requirements:

env "production" {
diff "clickhouse" {
// Global settings for all tables
session_settings {
match = "*"
values = {
max_execution_time = 300
}
}

// Specific settings for tables with experimental types
session_settings {
match = "users"
values = {
allow_suspicious_low_cardinality_types = 1
allow_experimental_variant_type = 1
}
}

// Performance settings for large analytical tables
session_settings {
match = "logs_*"
values = {
max_memory_usage = 20000000000
max_threads = 16
}
}
}
}

Additional Resources