Observability Services Integration Summary
Overview
Added a complete observability stack to NoETL (ClickHouse, Qdrant, and NATS JetStream) with unified activation/deactivation controls.
Files Created (20 total)
ClickHouse (7 files)
- ci/manifests/clickhouse/namespace.yaml
- ci/manifests/clickhouse/crds.yaml
- ci/manifests/clickhouse/operator.yaml
- ci/manifests/clickhouse/clickhouse-cluster.yaml
- ci/manifests/clickhouse/mcp-server.yaml
- ci/manifests/clickhouse/observability-schema.yaml
- ci/manifests/clickhouse/README.md
Qdrant (3 files)
- ci/manifests/qdrant/namespace.yaml
- ci/manifests/qdrant/qdrant.yaml
- ci/manifests/qdrant/README.md
NATS (3 files)
- ci/manifests/nats/namespace.yaml
- ci/manifests/nats/nats.yaml
- ci/manifests/nats/README.md
Taskfiles (4 files)
- ci/taskfile/clickhouse.yml - 16 ClickHouse tasks
- ci/taskfile/qdrant.yml - 11 Qdrant tasks
- ci/taskfile/nats.yml - 12 NATS tasks
- ci/taskfile/observability.yml - 11 unified control tasks
Documentation (3 files)
- docs/observability_services.md - Complete guide
- docs/clickhouse_observability.md - ClickHouse usage guide
- docs/clickhouse_integration_summary.md - Implementation summary
Integration Updates
Main Taskfile (taskfile.yml)
- Added clickhouse:, qdrant:, nats:, and observability: includes
- Changed bootstrap to use observability:activate-all
- Tasks are now namespaced (e.g., task clickhouse:deploy)
Bootstrap (ci/bootstrap/Taskfile-bootstrap.yml)
- Updated bootstrap:verify to check all observability services
- Updated dev:start to activate all services and display all ports
Copilot Instructions (.github/copilot-instructions.md)
- Added observability stack overview
- Updated development workflows with observability commands
Services
ClickHouse
Purpose: OLAP database for logs, metrics, traces
Access:
- HTTP: localhost:30123 (NodePort)
- Native: localhost:30900 (NodePort)
- MCP: localhost:8124
Features:
- OpenTelemetry schema (logs, metrics, traces, noetl_events)
- Materialized views for analytics
- ZSTD compression, bloom filter indexes
- 30-90 day TTL policies
Resources: 512Mi-2Gi memory, 500m-2000m CPU, 6Gi storage
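For example, logs can be aggregated directly through the task wrapper. A minimal sketch, assuming the schema defines an observability.logs table with a Timestamp column (verify the actual column names in ci/manifests/clickhouse/observability-schema.yaml):
task clickhouse:query -- "
  SELECT toStartOfHour(Timestamp) AS hour, count() AS log_count
  FROM observability.logs
  WHERE Timestamp > now() - INTERVAL 1 DAY
  GROUP BY hour
  ORDER BY hour
"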
Qdrant
Purpose: Vector database for embeddings and semantic search
Access:
- HTTP: localhost:30633 (NodePort)
- gRPC: localhost:30634 (NodePort)
Features:
- Vector similarity search
- Extended filtering
- On-disk payload storage
- Collection-based organization
Resources: 512Mi-2Gi memory, 250m-1000m CPU, 5Gi storage
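A filtered similarity search combines the first two features. A minimal sketch against the embeddings collection created in the API Examples section below; the source payload field is illustrative, and the inline Python call only generates a 384-dimensional placeholder vector:
# Generate a placeholder query vector matching the collection size (384)
VEC=$(python3 -c "print([0.01]*384)")
curl -X POST "http://localhost:30633/collections/embeddings/points/search" \
  -H "Content-Type: application/json" \
  -d "{\"vector\": $VEC, \"limit\": 3, \"filter\": {\"must\": [{\"key\": \"source\", \"match\": {\"value\": \"noetl\"}}]}}"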
NATS JetStream
Purpose: Messaging and key-value store
Access:
- Client: localhost:30422 (NodePort)
- Monitoring: localhost:30822 (NodePort)
Features:
- Stream persistence (5GB)
- Key-value store
- Message replay
- Consumer groups
- Default credentials: noetl/noetl
Resources: 512Mi-2Gi memory, 250m-1000m CPU, 5Gi storage
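Streams and key-value buckets beyond the provisioned defaults can be managed with the nats CLI. A minimal sketch, assuming the nats CLI is installed locally; the stream and bucket names are illustrative:
# Create a file-backed stream for NoETL events, accepting defaults for unset options
nats -s nats://noetl:noetl@localhost:30422 stream add NOETL_EVENTS \
  --subjects "events.>" --storage file --defaults
# Create a key-value bucket, then write and read a key
nats -s nats://noetl:noetl@localhost:30422 kv add runtime_config
nats -s nats://noetl:noetl@localhost:30422 kv put runtime_config log_level debug
nats -s nats://noetl:noetl@localhost:30422 kv get runtime_config log_level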
Task Usage
Unified Control
# Activate all services
task observability:activate-all
# Deactivate all services
task observability:deactivate-all
# Status check
task observability:status-all
# Health check
task observability:health-all
# Restart all
task observability:restart-all
Individual Service Control
# ClickHouse
task clickhouse:deploy
task clickhouse:status
task clickhouse:connect
task clickhouse:query -- "SELECT 1"
task clickhouse:health
task clickhouse:logs
# Qdrant
task qdrant:deploy
task qdrant:status
task qdrant:health
task qdrant:collections
task qdrant:logs
# NATS
task nats:deploy
task nats:status
task nats:health
task nats:streams
task nats:logs
Port Forwarding
task clickhouse:port-forward # HTTP:8123, Native:9000
task clickhouse:port-forward-mcp # MCP:8124
task qdrant:port-forward # HTTP:6333, gRPC:6334
task nats:port-forward # Client:4222, Monitoring:8222
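These tasks presumably wrap kubectl port-forward; a rough manual equivalent, where the namespace and service names are assumptions (check the manifests under ci/manifests/):
kubectl port-forward -n clickhouse svc/clickhouse 8123:8123 9000:9000
kubectl port-forward -n qdrant svc/qdrant 6333:6333 6334:6334
kubectl port-forward -n nats svc/nats 4222:4222 8222:8222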
Bootstrap Integration
Automatic Deployment
All services automatically deploy with:
task bootstrap
# or
task dev:start
# or
task bring-all
Verification
task bootstrap:verify
Shows deployment status for all observability services.
Service Endpoints
After dev:start completes, all endpoints are displayed:
NoETL started!
UI: http://localhost:8083
Grafana: kubectl port-forward -n vmstack svc/vmstack-grafana 3000:80
ClickHouse HTTP: localhost:30123 (NodePort)
ClickHouse Native: localhost:30900 (NodePort)
Qdrant HTTP: localhost:30633 (NodePort)
Qdrant gRPC: localhost:30634 (NodePort)
NATS Client: localhost:30422 (NodePort)
NATS Monitoring: localhost:30822 (NodePort)
Total Resources
- Memory: ~3.5-6Gi (1.5-2Gi per service)
- CPU: ~2-4 cores (~1 core per service)
- Storage: ~16Gi (ClickHouse 6Gi, Qdrant 5Gi, NATS 5Gi)
Task Count Summary
- ClickHouse: 23 tasks
- Qdrant: 11 tasks
- NATS: 12 tasks
- Observability: 11 unified tasks
- Total: 57 observability tasks
Quick Reference
Health Checks
# All services
task observability:health-all
# Individual
curl http://localhost:30123 # ClickHouse HTTP
clickhouse-client --host localhost --port 30900 # ClickHouse Native
curl http://localhost:30633 # Qdrant HTTP
curl http://localhost:30822/healthz # NATS Monitoring
Logs
task clickhouse:logs
task qdrant:logs
task nats:logs
Restart
task observability:restart-all
# or individual: clickhouse:restart, qdrant:restart, nats:restart
API Examples
ClickHouse Query
task clickhouse:query -- "SELECT COUNT(*) FROM observability.logs"
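A hypothetical round-trip check builds on this: insert a row, then count it back. The Timestamp and Body column names are assumptions based on the OpenTelemetry-style schema; verify them against the deployed schema first:
task clickhouse:query -- "INSERT INTO observability.logs (Timestamp, Body) VALUES (now(), 'smoke test')"
task clickhouse:query -- "SELECT COUNT(*) FROM observability.logs WHERE Body = 'smoke test'"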
Qdrant Create Collection
curl -X PUT "http://localhost:30633/collections/embeddings" \
-H "Content-Type: application/json" \
-d '{"vectors": {"size": 384, "distance": "Cosine"}}'
NATS Publish
nats -s nats://noetl:noetl@localhost:30422 pub events.test "message"
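To see the message arrive, subscribe in a second terminal before publishing:
nats -s nats://noetl:noetl@localhost:30422 sub "events.>"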
Documentation References
- docs/observability_services.md - Complete guide with examples
- docs/clickhouse_observability.md - ClickHouse detailed usage
- ci/manifests/clickhouse/README.md - ClickHouse manifests
- ci/manifests/qdrant/README.md - Qdrant manifests
- ci/manifests/nats/README.md - NATS manifests
Testing
Verify deployment:
# Deploy all
task observability:activate-all
# Check status
task observability:status-all
# Health check
task observability:health-all
# Test ClickHouse
task clickhouse:test
# Test Qdrant
task qdrant:test
# Test NATS
task nats:test
Cleanup
Remove all services:
task observability:deactivate-all
Next Steps
- Configure OpenTelemetry collector to send data to ClickHouse
- Create Grafana dashboards for observability data
- Implement vector embedding pipeline for Qdrant
- Set up NATS streams for NoETL events
- Add automated backup procedures
- Configure production-grade cluster settings