Skip to main content

Observability Services Integration Summary

Overview

Added complete observability stack to NoETL with ClickHouse, Qdrant, and NATS JetStream, with unified activation/deactivation controls.

Files Created (20 total)

ClickHouse (7 files)

  1. ci/manifests/clickhouse/namespace.yaml
  2. ci/manifests/clickhouse/crds.yaml
  3. ci/manifests/clickhouse/operator.yaml
  4. ci/manifests/clickhouse/clickhouse-cluster.yaml
  5. ci/manifests/clickhouse/mcp-server.yaml
  6. ci/manifests/clickhouse/observability-schema.yaml
  7. ci/manifests/clickhouse/README.md

Qdrant (3 files)

  1. ci/manifests/qdrant/namespace.yaml
  2. ci/manifests/qdrant/qdrant.yaml
  3. ci/manifests/qdrant/README.md

NATS (3 files)

  1. ci/manifests/nats/namespace.yaml
  2. ci/manifests/nats/nats.yaml
  3. ci/manifests/nats/README.md

Taskfiles (4 files)

  1. ci/taskfile/clickhouse.yml - 16 ClickHouse tasks
  2. ci/taskfile/qdrant.yml - 11 Qdrant tasks
  3. ci/taskfile/nats.yml - 12 NATS tasks
  4. ci/taskfile/observability.yml - 11 unified control tasks

Documentation (3 files)

  1. docs/observability_services.md - Complete guide
  2. docs/clickhouse_observability.md - ClickHouse usage guide
  3. docs/clickhouse_integration_summary.md - Implementation summary

Integration Updates

Main Taskfile (taskfile.yml)

  • Added clickhouse:, qdrant:, nats:, observability: includes
  • Changed bootstrap to use observability:activate-all
  • Tasks now namespaced (e.g., task clickhouse:deploy)

Bootstrap (ci/bootstrap/Taskfile-bootstrap.yml)

  • Updated bootstrap:verify to check all observability services
  • Updated dev:start to activate all services and display all ports

Copilot Instructions (.github/copilot-instructions.md)

  • Added observability stack overview
  • Updated development workflows with observability commands

Services

ClickHouse

Purpose: OLAP database for logs, metrics, traces

Access:

  • HTTP: localhost:30123 (NodePort)
  • Native: localhost:30900 (NodePort)
  • MCP: localhost:8124

Features:

  • OpenTelemetry schema (logs, metrics, traces, noetl_events)
  • Materialized views for analytics
  • ZSTD compression, bloom filter indexes
  • 30-90 day TTL policies

Resources: 512Mi-2Gi memory, 500m-2000m CPU, 6Gi storage

Qdrant

Purpose: Vector database for embeddings and semantic search

Access:

  • HTTP: localhost:30633 (NodePort)
  • gRPC: localhost:30634 (NodePort)

Features:

  • Vector similarity search
  • Extended filtering
  • On-disk payload storage
  • Collection-based organization

Resources: 512Mi-2Gi memory, 250m-1000m CPU, 5Gi storage

NATS JetStream

Purpose: Messaging and key-value store

Access:

  • Client: localhost:30422 (NodePort)
  • Monitoring: localhost:30822 (NodePort)

Features:

  • Stream persistence (5GB)
  • Key-value store
  • Message replay
  • Consumer groups
  • Default credentials: noetl/noetl

Resources: 512Mi-2Gi memory, 250m-1000m CPU, 5Gi storage

Task Usage

Unified Control

# Activate all services
task observability:activate-all

# Deactivate all services
task observability:deactivate-all

# Status check
task observability:status-all

# Health check
task observability:health-all

# Restart all
task observability:restart-all

Individual Service Control

# ClickHouse
task clickhouse:deploy
task clickhouse:status
task clickhouse:connect
task clickhouse:query -- "SELECT 1"
task clickhouse:health
task clickhouse:logs

# Qdrant
task qdrant:deploy
task qdrant:status
task qdrant:health
task qdrant:collections
task qdrant:logs

# NATS
task nats:deploy
task nats:status
task nats:health
task nats:streams
task nats:logs

Port Forwarding

task clickhouse:port-forward    # HTTP:8123, Native:9000
task clickhouse:port-forward-mcp # MCP:8124
task qdrant:port-forward # HTTP:6333, gRPC:6334
task nats:port-forward # Client:4222, Monitoring:8222

Bootstrap Integration

Automatic Deployment

All services automatically deploy with:

task bootstrap
# or
task dev:start
# or
task bring-all

Verification

task bootstrap:verify

Shows deployment status for all observability services.

Service Endpoints

After dev:start, shows all endpoints:

NoETL started!
UI: http://localhost:8083
Grafana: kubectl port-forward -n vmstack svc/vmstack-grafana 3000:80
ClickHouse HTTP: localhost:30123 (NodePort)
ClickHouse Native: localhost:30900 (NodePort)
Qdrant HTTP: localhost:30633 (NodePort)
Qdrant gRPC: localhost:30634 (NodePort)
NATS Client: localhost:30422 (NodePort)
NATS Monitoring: localhost:30822 (NodePort)

Total Resources

Memory: ~3.5-6Gi (1.5-2Gi per service) CPU: ~2-4 cores (~1 core per service) Storage: ~16Gi (ClickHouse 6Gi, Qdrant 5Gi, NATS 5Gi)

Task Count Summary

  • ClickHouse: 23 tasks
  • Qdrant: 11 tasks
  • NATS: 12 tasks
  • Observability: 11 unified tasks
  • Total: 57 observability tasks

Quick Reference

Health Checks

# All services
task observability:health-all

# Individual
curl http://localhost:30123 # ClickHouse HTTP
clickhouse-client --host localhost --port 30900 # ClickHouse Native
curl http://localhost:30633 # Qdrant HTTP
curl http://localhost:30822/healthz # NATS Monitoring

Logs

task clickhouse:logs
task qdrant:logs
task nats:logs

Restart

task observability:restart-all
# or individual: clickhouse:restart, qdrant:restart, nats:restart

API Examples

ClickHouse Query

task clickhouse:query -- "SELECT COUNT(*) FROM observability.logs"

Qdrant Create Collection

curl -X PUT "http://localhost:30633/collections/embeddings" \
-H "Content-Type: application/json" \
-d '{"vectors": {"size": 384, "distance": "Cosine"}}'

NATS Publish

nats -s nats://noetl:noetl@localhost:30422 pub events.test "message"

Documentation References

  • docs/observability_services.md - Complete guide with examples
  • docs/clickhouse_observability.md - ClickHouse detailed usage
  • ci/manifests/clickhouse/README.md - ClickHouse manifests
  • ci/manifests/qdrant/README.md - Qdrant manifests
  • ci/manifests/nats/README.md - NATS manifests

Testing

Verify deployment:

# Deploy all
task observability:activate-all

# Check status
task observability:status-all

# Health check
task observability:health-all

# Test ClickHouse
task clickhouse:test

# Test Qdrant
task qdrant:test

# Test NATS
task nats:test

Cleanup

Remove all services:

task observability:deactivate-all

Next Steps

  1. Configure OpenTelemetry collector to send data to ClickHouse
  2. Create Grafana dashboards for observability data
  3. Implement vector embedding pipeline for Qdrant
  4. Set up NATS streams for NoETL events
  5. Add automated backup procedures
  6. Configure production-grade cluster settings