DuckLake Tool (Canonical v10)

The ducklake tool executes DuckDB SQL against a PostgreSQL-backed metastore, so multiple workers can read and write concurrently without hitting DuckDB's single-writer file locking.

Use it when you need:

  • concurrent multi-worker writes/reads to the same logical tables
  • snapshot/time-travel semantics (implementation-defined)
  • shared storage-backed data path
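Conceptually, a DuckLake catalog is attached inside DuckDB roughly as follows. This is a sketch based on the DuckDB DuckLake extension's `ATTACH` syntax; the host/dbname values are placeholders, and the tool performs this step for you:

```sql
-- Load the DuckLake extension (illustrative; the runtime may do this automatically)
INSTALL ducklake;
LOAD ducklake;

-- Attach a Postgres-backed catalog; connection details are placeholders
ATTACH 'ducklake:postgres:dbname=ducklake_catalog host=localhost' AS analytics
    (DATA_PATH '/opt/noetl/data/ducklake');

-- Subsequent statements run against the attached catalog
USE analytics;
```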

Basic usage

```yaml
- step: create_table
  tool:
    - ddl:
        kind: ducklake
        # Either provide an explicit connection string...
        # catalog_connection: "postgresql://user:pass@host:5432/ducklake_catalog"
        # ...or provide unified auth (runtime-defined)
        auth:
          source: credential
          key: ducklake_catalog_pg
          service: postgres
        catalog_name: analytics
        data_path: /opt/noetl/data/ducklake
        command: |
          CREATE TABLE IF NOT EXISTS users(id INTEGER, email VARCHAR);
          INSERT INTO users VALUES (1, '[email protected]');
  spec:
    policy:
      rules:
        - when: "{{ outcome.status == 'error' }}"
          then: { do: fail }
        - else:
          then: { do: break }
```
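The same step can instead use an explicit connection string and a list of statements. This sketch assumes `commands` accepts a list of SQL strings (per the Required fields below); the connection string and step name are placeholders:

```yaml
- step: seed_users
  tool:
    - ddl:
        kind: ducklake
        catalog_connection: "postgresql://user:pass@host:5432/ducklake_catalog"
        catalog_name: analytics
        data_path: /opt/noetl/data/ducklake
        commands:
          - CREATE TABLE IF NOT EXISTS users(id INTEGER, email VARCHAR);
          - INSERT INTO users VALUES (2, '[email protected]');
```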

Required fields

| Field | Meaning |
| --- | --- |
| `catalog_connection` | Postgres connection string for the metastore (optional if `auth` is provided) |
| `auth` | Unified auth config resolved to a Postgres connection (optional if `catalog_connection` is provided) |
| `catalog_name` | DuckLake catalog name |
| `data_path` | Shared path for data files (RWX storage in k8s) |
| `command` or `commands` | SQL to execute |
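Because the catalog versions each commit, DuckLake-backed tables can typically be queried at a past snapshot. A sketch using the DuckDB DuckLake extension's time-travel syntax; as noted above, the exact semantics exposed by this tool are implementation-defined, and the version/timestamp values are illustrative:

```sql
-- Query the table as of a specific snapshot version
SELECT * FROM users AT (VERSION => 1);

-- Or as of a point in time
SELECT * FROM users AT (TIMESTAMP => TIMESTAMP '2025-01-01 00:00:00');
```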

See also

  • DuckDB tool: documentation/docs/reference/tools/duckdb.md
  • Retry semantics: documentation/docs/reference/retry_mechanism_v2.md