Atlas now supports vector_similarity indexes in ClickHouse. These indexes enable efficient approximate nearest neighbor (ANN) search on vector embedding columns, commonly used for AI/ML workloads like semantic search and recommendation systems.

Defining Vector Similarity Indexes

Use the index block with a vector_similarity type to define a vector index. The type expression takes the algorithm (e.g., hnsw), the distance function (e.g., L2Distance, cosineDistance), the number of vector dimensions (3 in the example below), and optional quantization parameters.

table "chunks" {
  schema = schema.my_db
  engine = MergeTree
  column "id" {
    null = false
    type = UInt64
  }
  column "embedding" {
    null = false
    type = sql("Array(Float32)")
  }
  index "vec_sim" {
    type        = sql("vector_similarity('hnsw', 'L2Distance', 3)")
    granularity = 100000000
    on {
      expr = "embedding"
    }
  }
  primary_key {
    columns = [column.id]
  }
}

Generated SQL

Atlas generates the appropriate CREATE TABLE statement with the vector similarity index definition:

CREATE TABLE `chunks` (
  `id` UInt64,
  `embedding` Array(Float32),
  INDEX `vec_sim` ((embedding)) TYPE vector_similarity('hnsw', 'L2Distance', 3)
    GRANULARITY 100000000
) ENGINE = MergeTree
PRIMARY KEY (`id`) ORDER BY (`id`) SETTINGS index_granularity = 8192;
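
Once the table exists, a nearest-neighbor query can take advantage of the index. The following is a minimal sketch in ClickHouse SQL, not something Atlas generates; the reference vector values and the LIMIT are invented for illustration. It assumes the query uses the same distance function as the index (L2Distance) and a reference vector with the same number of dimensions (3).

```sql
-- Illustrative query: the reference vector values are made up.
-- The distance function and vector length match the index definition above.
WITH [0.1, 0.2, 0.3] AS reference_vec
SELECT
  id,
  L2Distance(embedding, reference_vec) AS distance
FROM `chunks`
ORDER BY L2Distance(embedding, reference_vec)
LIMIT 5;
```

According to the ClickHouse documentation, the index is considered for queries of this shape: an ORDER BY on the indexed distance function combined with a LIMIT. Running EXPLAIN with indexes = 1 on the query is one way to check whether the index was actually used.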

Atlas fully manages the lifecycle of vector similarity indexes: creating, dropping, and diffing them alongside other schema objects. This means you can add vector search capabilities to existing tables and Atlas will generate the correct migration statements.
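
As a rough sketch of what adding or removing the index on an existing table looks like in plain ClickHouse SQL, consider the statements below. They are written by hand under the assumption that the chunks table already exists without the index; the exact statements Atlas plans for a given change may differ.

```sql
-- Sketch only: hand-written ClickHouse DDL, not confirmed Atlas output.

-- Add the vector similarity index to an existing table.
ALTER TABLE `chunks`
  ADD INDEX `vec_sim` embedding
  TYPE vector_similarity('hnsw', 'L2Distance', 3)
  GRANULARITY 100000000;

-- Remove the index again, e.g. when it is deleted from the schema definition.
ALTER TABLE `chunks` DROP INDEX `vec_sim`;
```

In ClickHouse, ADD INDEX applies to newly written data parts; building the index for data already in the table takes a separate ALTER TABLE ... MATERIALIZE INDEX statement.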
