Monitoring Stack Deployment

Complete guide for deploying and managing the VictoriaMetrics monitoring stack.

Overview

NoETL includes a complete monitoring stack based on VictoriaMetrics, providing:

  • Metrics Collection: VMAgent scrapes metrics from NoETL, Postgres, and Kubernetes
  • Metrics Storage: VMSingle stores time-series data
  • Visualization: Grafana dashboards with pre-configured panels
  • Alerting: VMAlert for metric-based alerts
  • Log Aggregation: VictoriaLogs for centralized logging
  • Service Discovery: Automatic discovery of services in Kubernetes

Quick Start

Deploy the complete monitoring stack:

noetl run automation/infrastructure/monitoring.yaml --set action=deploy

This deploys:

  • VictoriaMetrics Operator
  • VMStack (VMSingle, VMAgent, VMAlert, Grafana)
  • Metrics Server (Kubernetes metrics)
  • Kube State Metrics
  • Node Exporter
  • PostgreSQL Exporter
  • NoETL service monitors

Access Services

All services are accessible via NodePort mappings configured in the Kind cluster:

Service            URL                      Port Mapping
Grafana            http://localhost:3000    30300 → 3000
VictoriaLogs       http://localhost:9428    30428 → 9428
NoETL API          http://localhost:8082    30082 → 8082
PostgreSQL         localhost:54321          30321 → 54321
ClickHouse HTTP    http://localhost:30123   30123 → 30123
ClickHouse Native  localhost:30900          30900 → 30900
Qdrant HTTP        http://localhost:30633   30633 → 30633
Qdrant gRPC        localhost:30634          30634 → 30634
NATS               localhost:30422          30422 → 30422
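
As a quick sanity check of the HTTP mappings above, a small shell function can probe each URL and print the status code that curl sees. This is only a sketch: check_http_endpoints is a hypothetical helper name, the URL list is copied from the table, and the 3-second timeout is an arbitrary choice. A code of 000 means curl could not connect at all.

```shell
# Probe each HTTP endpoint from the table and print its HTTP status code.
# check_http_endpoints is a hypothetical helper, not part of NoETL.
check_http_endpoints() {
  for url in \
    http://localhost:3000 \
    http://localhost:9428 \
    http://localhost:8082 \
    http://localhost:30123 \
    http://localhost:30633
  do
    # -o /dev/null discards the body; -w prints only the status code.
    # curl prints 000 when it cannot connect within the timeout.
    code=$(curl -s -o /dev/null --max-time 3 -w '%{http_code}' "$url") || true
    printf '%s %s\n' "$code" "$url"
  done
}
```

Calling check_http_endpoints after a deploy prints one line per endpoint. The non-HTTP ports (PostgreSQL, ClickHouse Native, Qdrant gRPC, NATS) are left out because a plain HTTP probe does not apply to them.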

Grafana Access

Grafana is configured with authentication disabled for local development:

open http://localhost:3000

No credentials required - you'll have immediate access to all dashboards.

Available Actions

Main Actions

# Deploy complete stack
noetl run automation/infrastructure/monitoring.yaml --set action=deploy

# Check deployment status
noetl run automation/infrastructure/monitoring.yaml --set action=status

# Remove entire stack
noetl run automation/infrastructure/monitoring.yaml --set action=undeploy

Helm Repository Management

# Add all monitoring helm repositories
noetl run automation/infrastructure/monitoring.yaml --set action=add-helm-repos

# Add individual repositories
noetl run automation/infrastructure/monitoring.yaml --set action=add-victoriametrics-helm-repo
noetl run automation/infrastructure/monitoring.yaml --set action=add-vector-helm-repo
noetl run automation/infrastructure/monitoring.yaml --set action=add-metrics-server-helm-repo

Granular Deployment

Deploy components individually:

# Deploy VictoriaMetrics operator
noetl run automation/infrastructure/monitoring.yaml --set action=deploy-vmstack-operator

# Deploy VictoriaMetrics stack
noetl run automation/infrastructure/monitoring.yaml --set action=deploy-vmstack

# Deploy Metrics Server
noetl run automation/infrastructure/monitoring.yaml --set action=deploy-metrics-server

# Deploy PostgreSQL exporter
noetl run automation/infrastructure/monitoring.yaml --set action=deploy-exporter

# Deploy NoETL service scrape config
noetl run automation/infrastructure/monitoring.yaml --set action=deploy-noetl-scrape

# Deploy Vector log collector
noetl run automation/infrastructure/monitoring.yaml --set action=deploy-vector

# Deploy VictoriaLogs
noetl run automation/infrastructure/monitoring.yaml --set action=deploy-vmlogs

# Deploy Grafana dashboards
noetl run automation/infrastructure/monitoring.yaml --set action=deploy-dashboards
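
When deploying component by component, it can help to stop at the first failing step rather than continue with a half-built stack. The sketch below wraps the actions listed above in a loop. deploy_monitoring_steps is a hypothetical helper name, and the order is assumed from the order in which the actions are listed here, not taken from the playbook itself.

```shell
# Run the granular deploy actions one at a time, stopping at the first failure.
# deploy_monitoring_steps is a hypothetical helper; the order below is assumed
# from the order in which the actions are listed in this guide.
deploy_monitoring_steps() {
  for action in \
    deploy-vmstack-operator \
    deploy-vmstack \
    deploy-metrics-server \
    deploy-exporter \
    deploy-noetl-scrape \
    deploy-vector \
    deploy-vmlogs \
    deploy-dashboards
  do
    echo "==> ${action}"
    if ! noetl run automation/infrastructure/monitoring.yaml --set action="${action}"; then
      echo "Action ${action} failed; stopping." >&2
      return 1
    fi
  done
}
```

Running deploy_monitoring_steps returns a non-zero status as soon as one action fails, so it can be chained with && in a larger script.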

Architecture

Components

VictoriaMetrics Operator

  • Namespace: vmoperator
  • Manages VictoriaMetrics custom resources (VMAgent, VMSingle, VMAlert, etc.)
  • Handles service discovery and scrape configuration

VMStack

  • Namespace: vmstack
  • VMSingle: Time-series database
  • VMAgent: Metrics scraper and forwarder
  • VMAlert: Alerting engine
  • Grafana: Visualization and dashboards
  • Kube State Metrics: Kubernetes cluster metrics
  • Node Exporter: Node-level metrics

Additional Exporters

  • PostgreSQL Exporter: Database metrics (namespace: postgres)
  • NoETL Service Monitors: Application-specific metrics

Data Flow

Kubernetes Services → VMAgent (scrapes) → VMSingle (stores) → Grafana (visualizes)
                                              ↓
                                          VMAlert (alerts)

Verification

Check Pod Status

# All monitoring components
kubectl get pods -A | grep -E "vm|grafana"

# Operator namespace
kubectl get pods -n vmoperator

# Stack namespace
kubectl get pods -n vmstack

Expected output:

NAMESPACE    NAME                                         READY   STATUS
vmoperator   vmoperator-victoria-metrics-operator-...     1/1     Running
vmstack      vmagent-vmstack-...                          2/2     Running
vmstack      vmalertmanager-vmstack-...                   2/2     Running
vmstack      vmsingle-vmstack-...                         1/1     Running
vmstack      vmstack-grafana-...                          2/2     Running
vmstack      vmstack-kube-state-metrics-...               1/1     Running
vmstack      vmstack-prometheus-node-exporter-...         1/1     Running
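
Instead of polling the pod list by hand, kubectl wait can block until the pods report Ready. The following is a minimal sketch: wait_for_monitoring is a hypothetical helper name, and the default 300s timeout is an arbitrary value for a local Kind cluster.

```shell
# Block until every pod in the monitoring namespaces reports Ready.
# wait_for_monitoring is a hypothetical helper; the default timeout is arbitrary.
wait_for_monitoring() {
  wait_timeout="${1:-300s}"
  for ns in vmoperator vmstack; do
    # kubectl wait exits non-zero if any pod misses the deadline.
    kubectl wait --for=condition=Ready pods --all -n "$ns" \
      --timeout="$wait_timeout" || return 1
  done
}
```

For example, wait_for_monitoring 120s waits up to two minutes per namespace. Pods that run to completion never become Ready, so if either namespace contains such pods this check would need a label selector to exclude them.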

Check Services

kubectl get svc -n vmstack

Test Metrics Collection

# Query VMSingle directly (assumes port 8429 is reachable on localhost, for example via kubectl port-forward)
curl "http://localhost:8429/api/v1/query?query=up"

# Or through Grafana Explore view
open http://localhost:3000/explore
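
Queries more complex than up contain spaces or braces that must be URL-encoded. The sketch below lets curl handle the encoding. vm_query and the VM_URL variable are hypothetical names, and the default address is the one used in the curl command above.

```shell
# Run an instant query against the Prometheus-compatible HTTP API.
# vm_query and VM_URL are hypothetical names; the default address is the
# one used in the curl example in this section.
vm_query() {
  # -G sends the --data-urlencode pair as a URL query string, so
  # expressions containing spaces or braces are encoded safely.
  curl -sS -G "${VM_URL:-http://localhost:8429}/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For example, vm_query 'up == 0' returns only the scrape targets whose last scrape failed, which is often a faster check than reading the full up result.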

Dashboards

Pre-configured Grafana dashboards:

NoETL Dashboards

  • NoETL Overview: High-level application metrics
  • NoETL Server: Server-specific metrics
  • NoETL Worker: Worker pool metrics
  • Execution Details: Per-execution metrics

PostgreSQL Dashboards

  • PostgreSQL Overview: Database health and performance
  • PostgreSQL Queries: Query performance and statistics
  • PostgreSQL Connections: Connection pool monitoring

Kubernetes Dashboards

  • Cluster Overview: Node and pod metrics
  • Resource Usage: CPU, memory, disk utilization
  • Network Traffic: Network I/O across cluster

Configuration

Custom Scrape Configs

Add custom service monitors by creating VMServiceScrape resources:

apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
  name: my-service
  namespace: vmstack
spec:
  selector:
    matchLabels:
      app: my-service
  endpoints:
    - port: metrics
      path: /metrics

Retention Settings

Modify VMSingle retention (default: 7 days):

kubectl edit vmsingle -n vmstack vmstack-victoria-metrics-k8s-stack

Change the retentionPeriod value.
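
For scripted changes, the same field can be set without opening an editor. This is a sketch under assumptions: set_vmsingle_retention is a hypothetical helper name, and it assumes the field lives at spec.retentionPeriod on the VMSingle resource named in the kubectl edit command above.

```shell
# Set spec.retentionPeriod on the VMSingle resource without opening an editor.
# set_vmsingle_retention is a hypothetical helper; the resource name and
# namespace are the ones used in the kubectl edit command in this section.
set_vmsingle_retention() {
  period="${1:?usage: set_vmsingle_retention <period, e.g. 30d>}"
  # A JSON merge patch touches only the field given and leaves the rest as-is.
  kubectl patch vmsingle vmstack-victoria-metrics-k8s-stack -n vmstack \
    --type merge -p "{\"spec\":{\"retentionPeriod\":\"${period}\"}}"
}
```

For example, set_vmsingle_retention 30d requests 30 days of retention. Because the resource is created by the Helm release, a later helm upgrade may reset the value unless the Helm values are changed as well.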

Grafana Configuration

Grafana configuration is managed via Helm values. To customize:

  1. Extract current values:
helm get values vmstack -n vmstack > vmstack-values.yaml
  2. Modify the grafana section.
  3. Upgrade the release:

helm upgrade vmstack vm/victoria-metrics-k8s-stack \
  -n vmstack \
  -f vmstack-values.yaml

Troubleshooting

Pods Not Starting

Check events:

kubectl get events -n vmstack --sort-by='.lastTimestamp'

Check logs:

kubectl logs -n vmstack -l app.kubernetes.io/name=vmsingle

No Metrics Appearing

  1. Check that VMAgent is running:
kubectl logs -n vmstack -l app.kubernetes.io/name=vmagent
  2. Verify service discovery:
kubectl get vmservicescrape -n vmstack
  3. Check scrape targets in Grafana.

Grafana Dashboard Not Loading

  1. Verify that the Grafana pod is running:
kubectl get pods -n vmstack | grep grafana
  2. Check Grafana logs:
kubectl logs -n vmstack -l app.kubernetes.io/name=grafana
  3. Restart Grafana:
kubectl rollout restart deployment/vmstack-grafana -n vmstack

Port Mapping Issues

Port mappings are defined in ci/kind/config.yaml. Changes require cluster recreation:

# Delete cluster
kind delete cluster --name noetl

# Recreate with new config
kind create cluster --config ci/kind/config.yaml
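
The two steps above can be combined into one function that deletes the cluster only if it exists. This is a sketch: recreate_kind_cluster is a hypothetical helper name, and passing --name noetl to the create command is an assumption that the cluster should keep the name used in the delete command.

```shell
# Delete the Kind cluster if it exists, then recreate it from the config file.
# recreate_kind_cluster is a hypothetical helper. Passing --name assumes the
# cluster is called noetl, as in the delete command in this section.
recreate_kind_cluster() {
  # kind get clusters prints one cluster name per line.
  if kind get clusters | grep -qx noetl; then
    kind delete cluster --name noetl || return 1
  fi
  kind create cluster --name noetl --config ci/kind/config.yaml
}
```

Recreating the cluster removes every workload running in it, so the monitoring stack has to be deployed again afterwards.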

Cleanup

Remove Individual Components

# Remove dashboards
noetl run automation/infrastructure/monitoring.yaml --set action=remove-dashboards

# Remove exporters
noetl run automation/infrastructure/monitoring.yaml --set action=remove-exporter

# Remove VMStack
noetl run automation/infrastructure/monitoring.yaml --set action=remove-vmstack

# Remove operator
noetl run automation/infrastructure/monitoring.yaml --set action=remove-vmstack-operator

Complete Removal

noetl run automation/infrastructure/monitoring.yaml --set action=undeploy

This removes:

  • All VMStack components
  • VictoriaMetrics operator
  • Metrics Server
  • Custom resource definitions
  • Namespaces (vmstack, vmoperator)

Production Considerations

For production deployments:

  1. Enable Authentication: Configure Grafana with proper authentication
  2. TLS/HTTPS: Enable TLS for all service endpoints
  3. Persistence: Configure persistent storage for metrics
  4. High Availability: Deploy VMCluster instead of VMSingle
  5. Resource Limits: Set appropriate CPU/memory limits
  6. Backup: Configure regular backups for metrics data
  7. Alerting: Set up alert receivers (email, Slack, PagerDuty)
  8. Retention: Adjust retention period based on requirements

Example production Helm values:

vmsingle:
  retentionPeriod: "30d"
  persistence:
    enabled: true
    storageClassName: "fast-ssd"
    size: 100Gi
  resources:
    limits:
      cpu: "2"
      memory: "4Gi"

grafana:
  admin:
    existingSecret: grafana-admin-secret
  ingress:
    enabled: true
    tls:
      - secretName: grafana-tls
        hosts:
          - grafana.example.com
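
Before applying values like these to a live release, it is worth letting Helm render them first. The sketch below runs a dry run and applies the upgrade only if the dry run succeeds. upgrade_vmstack is a hypothetical helper name, production-values.yaml is an assumed file name, and the release and chart names are the ones from the helm upgrade command under Grafana Configuration.

```shell
# Render the upgrade with a values file first, then apply it only if the
# dry run succeeds. upgrade_vmstack is a hypothetical helper, and
# production-values.yaml is an assumed file name for the values shown here.
upgrade_vmstack() {
  values_file="${1:-production-values.yaml}"
  # --dry-run simulates the upgrade without changing the release.
  helm upgrade vmstack vm/victoria-metrics-k8s-stack \
    -n vmstack -f "$values_file" --dry-run >/dev/null || return 1
  helm upgrade vmstack vm/victoria-metrics-k8s-stack \
    -n vmstack -f "$values_file"
}
```

A dry run catches template and YAML errors early, but it does not prove that the storage class or the referenced secrets exist in the cluster.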