Architecture

NoETL uses a server-worker architecture for distributed workflow execution.

Component Overview

[Figure: NoETL Components]

Components

Gateway

Rust-based API gateway for external clients:

  • Exposes GraphQL API for playbook execution
  • Provides REST API for Auth0 authentication (/api/auth/*)
  • Session validation middleware with NATS K/V caching
  • Pure gateway design - no direct database connections
  • All data access through Control Plane API
  • NATS K/V for fast session lookups (sub-millisecond)
  • Future: WebSocket subscriptions via NATS for live updates

Session Caching: The Gateway checks NATS K/V for a cached session before calling the auth playbooks. On a cache miss, the playbooks validate the session against PostgreSQL and refresh the cache.
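The lookup order described above is a cache-aside pattern. A minimal sketch follows, assuming an in-memory dictionary in place of the NATS K/V bucket and a callback in place of the auth playbook; `Session`, `SessionCache`, and `validate_session` are hypothetical names used only for illustration, and the sketch folds the playbook's cache refresh into one function for brevity.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Session:
    token: str
    user_id: str
    expires_at: float  # Unix timestamp


class SessionCache:
    """In-memory stand-in for a NATS K/V bucket (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._store: Dict[str, Session] = {}

    def get(self, token: str) -> Optional[Session]:
        session = self._store.get(token)
        if session is not None and session.expires_at <= time.time():
            # Expired entries are dropped and treated as a miss.
            del self._store[token]
            return None
        return session

    def put(self, session: Session) -> None:
        self._store[session.token] = session


def validate_session(
    token: str,
    cache: SessionCache,
    validate_from_db: Callable[[str], Optional[Session]],
) -> Optional[Session]:
    """Check the cache first; on a miss, validate from the database and refresh the cache."""
    cached = cache.get(token)
    if cached is not None:
        return cached
    # Stands in for the auth playbook that validates against PostgreSQL.
    session = validate_from_db(token)
    if session is not None:
        cache.put(session)
    return session
```

With this shape, a second call for the same valid token is served from the cache and never reaches the database callback.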

NoETL Control Plane

Central coordination service:

  • Exposes REST APIs for catalog, credentials, executions, and events
  • Schedules and supervises workflow executions
  • Publishes task notifications to NATS JetStream
  • Receives execution events from workers
  • Manages retries and backoff policies
  • Reconstructs workflow state from the noetl.event table
  • Used by CLIs, UIs, and integrations

Event Store and Projections

NoETL's current event-sourced runtime uses PostgreSQL as the authoritative event and projection store:

  • noetl.event is the append-only execution history.
  • Projection tables such as executions, commands, stages, frames, and runtime state are rebuildable from events.
  • Replay validation reads the event stream and checks projected runtime state without depending on worker memory.
  • NATS JetStream is used for command notification and worker delivery, not as the authoritative event store in the current implementation.
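To illustrate what "rebuildable from events" means, here is a minimal sketch that folds an ordered event list into a per-execution projection and compares it with a stored copy. The event type names (`step_started`, `step_completed`, `step_failed`, `execution_completed`) and all class and function names are assumptions made for this example, not the actual noetl.event schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Event:
    execution_id: str
    event_type: str              # hypothetical event type names, see lead-in
    step: Optional[str] = None


@dataclass
class ExecutionProjection:
    status: str = "pending"
    steps: Dict[str, str] = field(default_factory=dict)


def rebuild_projections(events: Iterable[Event]) -> Dict[str, ExecutionProjection]:
    """Fold the ordered event stream into per-execution state."""
    projections: Dict[str, ExecutionProjection] = {}
    for event in events:
        proj = projections.setdefault(event.execution_id, ExecutionProjection())
        if event.event_type == "step_started" and event.step:
            proj.steps[event.step] = "running"
            proj.status = "running"
        elif event.event_type == "step_completed" and event.step:
            proj.steps[event.step] = "completed"
        elif event.event_type == "step_failed" and event.step:
            proj.steps[event.step] = "failed"
            proj.status = "failed"
        elif event.event_type == "execution_completed":
            proj.status = "completed"
    return projections


def replay_matches(
    events: Iterable[Event], stored: Dict[str, ExecutionProjection]
) -> bool:
    """Replay validation: the stored projection must equal a fresh rebuild."""
    return rebuild_projections(events) == stored
```

Because the fold depends only on the event list, it can run without any worker memory, which is the property replay validation relies on.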

The broader event-store abstraction described in the architecture roadmap keeps the same separation of concerns but allows future deployments to bind the event stream to NATS JetStream, Kafka, Pub/Sub, Event Hubs, Kinesis, or MSK, and bind projection state to PostgreSQL or cloud-native/document/analytic stores. The DSL and playbooks stay backend-neutral; infrastructure configuration selects the adapters.
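One way to picture "infrastructure configuration selects the adapters" is a small registry keyed by a backend name. The sketch below is an assumption about shape only: the `EventStream` protocol, the `ADAPTERS` registry, the `event_stream.backend` config key, and the in-memory adapter are all hypothetical and are not taken from the NoETL codebase or roadmap.

```python
from typing import Callable, Dict, List, Protocol


class EventStream(Protocol):
    """Backend-neutral interface that callers depend on (hypothetical)."""

    def append(self, event: dict) -> None: ...

    def read_all(self) -> List[dict]: ...


class InMemoryEventStream:
    """Toy adapter; a real deployment would bind PostgreSQL, JetStream, Kafka, etc."""

    def __init__(self) -> None:
        self._events: List[dict] = []

    def append(self, event: dict) -> None:
        self._events.append(dict(event))

    def read_all(self) -> List[dict]:
        return list(self._events)


# Registry of adapter factories, keyed by the name used in configuration.
ADAPTERS: Dict[str, Callable[[], EventStream]] = {
    "memory": InMemoryEventStream,
}


def event_stream_from_config(config: dict) -> EventStream:
    """Pick the adapter named in configuration; callers never name a backend."""
    backend = config["event_stream"]["backend"]
    try:
        factory = ADAPTERS[backend]
    except KeyError:
        raise ValueError(f"no event stream adapter registered for {backend!r}") from None
    return factory()
```

Code that appends or reads events only sees the `EventStream` interface, so swapping the backend is a configuration change rather than a playbook change.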

Worker Pools

Stateless background executors:

  • Subscribe to NATS JetStream for task notifications
  • Retrieve task details via Control Plane API
  • Run workflow steps and tools (HTTP, SQL, Python, etc.)
  • Report events back via Control Plane API
  • Scale horizontally based on load
  • Isolated execution environments

NATS JetStream

Message broker for task distribution:

  • Control Plane publishes task notifications to NATS streams
  • Workers subscribe and acknowledge messages
  • Messages contain pointers to Control Plane API for task details
  • Durable subscriptions ensure no task loss
  • Supports multiple worker pools and load balancing

Result References and Shared Cache

Workers do not push large payloads through the event stream. Task outcomes use a reference-first model:

  • Small values may be inline.
  • Larger results are stored behind a ResultRef / TempRef with a logical noetl://... URI.
  • The durable reference remains the source of truth and can resolve from KV, disk, S3-compatible storage, GCS, or PostgreSQL depending on tier and runtime configuration.
  • Tabular cursor-frame payloads can be serialized as Apache Arrow IPC (application/vnd.apache.arrow.stream).
  • Co-located producer/consumer workers can attach to an optional same-node shared-memory ipc hint on the reference. This is Tier 1.5: a best-effort acceleration path, not authoritative state.
  • If the IPC hint is missing, expired, evicted, or belongs to a different node, resolution falls back to the durable ResultRef.

The live Phase 3 IPC proof validated this path end to end: a cursor frame was written as Arrow IPC bytes with an IPC hint attached, the rows were read through shared memory, the hint was evicted, and the same rows were then read successfully through the durable fallback.
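The resolution order (same-node hint first, durable reference otherwise) can be sketched as follows. Plain dictionaries stand in for shared memory and the durable tier, and `ResultRef`, `IpcHint`, `resolve`, and the example URI are hypothetical names for illustration rather than the actual NoETL types.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class IpcHint:
    node: str     # node that holds the shared-memory segment
    segment: str  # hypothetical segment name


@dataclass(frozen=True)
class ResultRef:
    uri: str                            # logical noetl:// URI (example value only)
    ipc_hint: Optional[IpcHint] = None  # optional, best-effort acceleration


def resolve(
    ref: ResultRef,
    local_node: str,
    shared_memory: Dict[str, bytes],
    durable_store: Dict[str, bytes],
) -> bytes:
    """Try the same-node IPC hint first; otherwise read the durable reference."""
    hint = ref.ipc_hint
    if hint is not None and hint.node == local_node:
        payload = shared_memory.get(hint.segment)
        if payload is not None:
            return payload
    # Hint missing, evicted, or owned by another node: the durable
    # reference remains the source of truth.
    return durable_store[ref.uri]
```

The hint never changes what is returned, only where the bytes come from, so dropping it at any point is safe.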

Catalog & Credentials

Storage for workflow definitions and connection secrets:

  • Catalog: Playbooks, versions, schemas, tool definitions
  • Credentials: Connection configs and tokens with scoped access

Event Bus and Telemetry

Observability infrastructure:

  • Every step emits structured events (start/finish/errors, durations)
  • Events exported to analytics backends (ClickHouse, VictoriaMetrics)
  • Vector stores (Qdrant) for AI-assisted optimization and semantic search

Storage/Compute Integrations

Connectors for external systems:

  • Warehouses: DuckDB, PostgreSQL, ClickHouse, Snowflake
  • Files/Lakes: GCS, S3, local filesystem
  • Vector DBs: Qdrant
  • External services: HTTP APIs

Data Flow

┌─────────────┐     ┌─────────────┐     ┌───────────────┐     ┌─────────────┐
│   Web UI    │────▶│   Gateway   │────▶│ Control Plane │────▶│    NATS     │
│  (GraphQL)  │     │   (Rust)    │     │   (FastAPI)   │     │  JetStream  │
└─────────────┘     └─────────────┘     └───────┬───────┘     └──────┬──────┘
                                                │                    │
┌─────────────┐                                 │              ┌─────▼─────┐
│   CLI/API   │─────────────────────────────────┘              │  Workers  │
│  (Direct)   │                                 │◀─────────────│ (Execute) │
└─────────────┘                                 │   (events)   └───────────┘

                                         ┌─────────────┐
                                         │ PostgreSQL  │
                                         │  (Events)   │
                                         └─────────────┘

  1. Web UI sends GraphQL requests to Gateway
  2. Gateway authenticates and forwards to Control Plane API
  3. CLI/API can also call Control Plane directly
  4. Control Plane validates, creates execution, publishes task to NATS
  5. Workers receive NATS message with task pointer
  6. Workers fetch task details from Control Plane API
  7. Workers execute steps and report events to Control Plane API
  8. Control Plane stores events in PostgreSQL noetl.event table
  9. Control Plane monitors events to determine next steps in workflow
  10. Large results are stored by reference; events carry ResultRef metadata
  11. Cursor frames may attach an optional Arrow IPC shared-memory hint for same-node consumers, while durable storage remains the fallback

Database Schema

The NoETL PostgreSQL schema is intentionally simple - no queue tables:

Table       Purpose
catalog     Playbook definitions (path, version, content)
event       Execution events (status, results, errors)
credential  Encrypted credentials
keychain    Runtime token cache with TTL
transient   Execution-scoped variables
runtime     Worker pool and server registration
schedule    Cron/interval scheduled playbooks

Control loop: The Control Plane analyzes the event table to reconstruct execution state and determine the next steps, then publishes tasks to NATS.

Communication Patterns

Control Plane → NATS → Worker

Task distribution via NATS JetStream:

  1. Control Plane publishes task notification to NATS stream
  2. Message contains execution_id and task pointer (not full payload)
  3. Worker subscribes, receives message, acknowledges
  4. Worker calls Control Plane API to get full task context
  5. Worker executes and reports events to Control Plane API
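The five steps above can be sketched with callbacks standing in for the NATS subscription and the Control Plane API. The JSON field names (`execution_id`, `task_id`), the event dictionary shape, and both function names are assumptions made for illustration; no real NATS or NoETL client calls are used.

```python
import json
from typing import Any, Callable, Dict


def build_notification(execution_id: str, task_id: str) -> bytes:
    """Encode a pointer-only notification (no task payload)."""
    return json.dumps({"execution_id": execution_id, "task_id": task_id}).encode("utf-8")


def handle_notification(
    raw: bytes,
    fetch_task: Callable[[str, str], Dict[str, Any]],
    run_task: Callable[[Dict[str, Any]], Any],
    report_event: Callable[[Dict[str, Any]], None],
    ack: Callable[[], None],
) -> None:
    """Worker side: acknowledge, fetch the full task, execute, report an event."""
    pointer = json.loads(raw.decode("utf-8"))
    ack()  # step 3: acknowledge receipt of the notification
    # Step 4: the message is only a pointer; the task context comes from the API.
    task = fetch_task(pointer["execution_id"], pointer["task_id"])
    event = dict(pointer)
    try:
        event["status"] = "completed"
        event["result"] = run_task(task)
    except Exception as exc:
        # Failures are reported as events too, rather than dropped.
        event = dict(pointer, status="failed", error=str(exc))
    report_event(event)  # step 5: report back via the Control Plane API
```

Keeping the message to a pointer means the broker carries small, uniform messages while task context stays behind the Control Plane API.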

Event-Driven State

All execution state is persisted as events in PostgreSQL:

  • Control Plane reconstructs workflow state from the noetl.event table
  • Determines which steps completed, which are pending
  • Publishes next tasks to NATS based on workflow graph
  • Enables replay, debugging, and distributed execution
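A minimal sketch of how pending steps could be derived from recorded events and a workflow graph is shown below. The graph format (step name mapped to its prerequisite steps), the `(step, status)` event tuples, and the function name `next_steps` are simplifying assumptions for this example, not the NoETL playbook model.

```python
from typing import Dict, List, Tuple


def next_steps(
    graph: Dict[str, List[str]],
    events: List[Tuple[str, str]],
) -> List[str]:
    """Return steps that have not been dispatched and whose prerequisites completed.

    graph maps each step to the steps it depends on; events are (step, status)
    pairs read from the event log in order.
    """
    completed = {step for step, status in events if status == "completed"}
    dispatched = {step for step, _ in events}
    ready = [
        step
        for step, deps in graph.items()
        if step not in dispatched and all(dep in completed for dep in deps)
    ]
    return sorted(ready)
```

For a linear graph such as extract, then transform, then load, an empty event log yields only the first step, and each completion event unlocks the next one. Since the decision is a pure function of the event log, re-running it after a restart produces the same answer.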

Scaling

Horizontal Scaling

  • Workers: Add more worker replicas for throughput
  • Control Plane: A single Control Plane instance coordinates all executions
  • Database: PostgreSQL handles concurrent access

Resource Pools

Configure worker pools for different resource types:

  • CPU-intensive workloads
  • GPU workloads (future)
  • I/O-bound operations
