Test Infrastructure Setup Guide

This guide covers the infrastructure components, setup procedures, and configuration requirements for the NoETL test environment.

Infrastructure Overview

The NoETL test infrastructure consists of multiple interconnected components:

NoETL Test Infrastructure
├── Core Services
│ ├── NoETL Server (API Gateway)
│ ├── PostgreSQL Database (State Storage)
│ └── Worker Processes (Job Execution)
├── External Services (Integration Tests)
│ ├── Weather API (Open-Meteo)
│ ├── Google Cloud Storage (GCS)
│ └── Cloud Authentication Services
└── Development Tools
  ├── Docker Compose (Local Development)
  ├── Kubernetes (Production-like Testing)
  └── Make Automation (Build & Test)

Component Architecture

Core Services

NoETL Server

  • Purpose: Main API gateway and workflow orchestration
  • Port: 8082 (default)
  • Dependencies: PostgreSQL database
  • Health Check: GET /health
  • Configuration: Environment variables and config files
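
When scripting against the server, it helps to wait for the health check to pass before sending requests. The following is a minimal sketch, assuming that a healthy server answers GET /health with HTTP 200; the function names `http_status` and `wait_for_health` are illustrative and not part of NoETL.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_for_health(
    url: str = "http://localhost:8082/health",
    attempts: int = 30,
    delay: float = 1.0,
    probe: Callable[[str], int] = http_status,
) -> bool:
    """Poll `url` until it returns 200 (assumed to mean healthy)."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_health()` after starting the server blocks for at most roughly `attempts * delay` seconds plus request time. The `probe` parameter exists so the polling logic can be exercised without a running server.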

PostgreSQL Database

  • Purpose: Persistent storage for workflow state, events, and results
  • Port: 5432 (default)
  • Schema: Auto-initialized with NoETL tables
  • Test Database: noetl (can be reset for clean state)

Worker Processes

  • Purpose: Asynchronous job execution and processing
  • Count: Configurable (default: 2 CPU workers, 1 GPU worker)
  • Communication: Database queue-based task distribution
  • Scaling: Horizontal scaling supported
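
To illustrate what database queue-based task distribution means in general, here is a small generic sketch using an in-memory SQLite table. It is not NoETL's implementation: the `worker` column, the status values, and the function names are assumptions made for the example, and a PostgreSQL-backed queue would more likely rely on row locking than on this conditional update.

```python
import sqlite3
from typing import Optional


def make_queue(conn: sqlite3.Connection, n_tasks: int) -> None:
    """Create a toy queue table and fill it with pending tasks."""
    conn.execute(
        "CREATE TABLE queue_items ("
        "id INTEGER PRIMARY KEY, status TEXT NOT NULL, worker TEXT)"
    )
    conn.executemany(
        "INSERT INTO queue_items (status) VALUES (?)",
        [("pending",)] * n_tasks,
    )
    conn.commit()


def claim_next(conn: sqlite3.Connection, worker: str) -> Optional[int]:
    """Claim the oldest pending task for `worker`; return its id or None."""
    row = conn.execute(
        "SELECT id FROM queue_items WHERE status = 'pending' "
        "ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # The status check in WHERE makes the claim conditional: if another
    # worker took the row first, no row is updated and rowcount is 0.
    cur = conn.execute(
        "UPDATE queue_items SET status = 'running', worker = ? "
        "WHERE id = ? AND status = 'pending'",
        (worker, row[0]),
    )
    conn.commit()
    return row[0] if cur.rowcount == 1 else None
```

Each worker calls `claim_next` in a loop and processes the returned task. Because workers coordinate only through the table, adding workers does not require changing the server, which is what makes horizontal scaling straightforward in this pattern.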

External Services (Integration Testing)

Weather API (Open-Meteo)

  • Endpoint: https://api.open-meteo.com/v1/forecast
  • Purpose: Real HTTP API integration testing
  • Authentication: None required (public API)
  • Rate Limits: Reasonable for testing (no aggressive throttling)

Google Cloud Storage (GCS)

  • Purpose: Cloud storage output validation
  • Authentication: HMAC keys or Service Account
  • Bucket: Test-specific bucket (noetl-test-bucket)
  • Format: Parquet file uploads

Pagination Test Server

  • Purpose: HTTP pagination pattern testing
  • Deployment: Kubernetes (kind cluster)
  • Endpoints:
    • ClusterIP: http://paginated-api.test-server.svc.cluster.local:5555
    • NodePort: http://localhost:30555
  • Patterns Supported:
    • Page-based pagination (/api/v1/assessments)
    • Offset-based pagination (/api/v1/users)
    • Cursor-based pagination (/api/v1/events)
    • Flaky endpoint for retry testing (/api/v1/flaky)
  • Management: Via automation playbook (see Automation Playbooks)
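
A client for the page-based pattern can be sketched as a loop that requests pages until the server reports there are no more. The response shape used here (an `items` list and a `has_more` flag) is an assumption for illustration; check the actual payload returned by /api/v1/assessments before relying on these field names.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[str, Any]


def iter_items(
    fetch_page: Callable[[int], Page],
    first_page: int = 1,
    max_pages: int = 1000,
) -> Iterator[Any]:
    """Yield items from consecutive pages until `has_more` is falsy.

    `fetch_page(n)` must return the decoded JSON body of page n.
    The keys "items" and "has_more" are hypothetical field names.
    """
    page = first_page
    for _ in range(max_pages):
        body = fetch_page(page)
        yield from body.get("items", [])
        if not body.get("has_more"):
            return
        page += 1
    # Guard against a server that never signals the last page.
    raise RuntimeError(f"gave up after {max_pages} pages")
```

In a real test, `fetch_page` would wrap an HTTP GET against the NodePort address (http://localhost:30555). Passing the fetcher in as a parameter keeps the paging logic testable without the kind cluster.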

Quick Setup:

# Full deployment
noetl run automation/test/pagination-server.yaml --set action=full

# Check status
noetl run automation/test/pagination-server.yaml --set action=status

# Test endpoints
noetl run automation/test/pagination-server.yaml --set action=test

Setup Procedures

1. Local Development Setup

Prerequisites

# Required tools
- Python 3.11+
- Docker & Docker Compose
- Git
- Make
- curl & jq (for API testing)

# Optional tools (recommended)
- kind (Kubernetes in Docker)
- kubectl (Kubernetes CLI)
- pgcli (PostgreSQL CLI)

Quick Setup

# 1. Clone repository
git clone <repository-url>
cd noetl

# 2. Install dependencies
make install-uv # Install uv package manager
make create-venv # Create virtual environment
make install-dev # Install development dependencies

# 3. Start services
make up # Start all services via Docker Compose

# 4. Verify installation
make status # Check all service status
make test # Run static tests

Detailed Setup Steps

# Create and activate virtual environment
make create-venv
source .venv/bin/activate

# Install NoETL package in development mode
make install-dev

# Verify installation
.venv/bin/noetl --help

# Start PostgreSQL
make postgres-start

# Initialize database schema
make postgres-reset-schema

# Start NoETL server
make noetl-start

# Start worker processes
make worker-start

# Verify all services are running
make server-status # Should return health check
make postgres-status # Should show connection success
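
If the make targets are unavailable (for example inside a CI container), a rough equivalent of the verification step is to check that the default ports accept TCP connections. This sketch only confirms that something is listening; it does not prove the service is healthy. The names `port_open`, `check_services`, and `DEFAULT_SERVICES` are illustrative.

```python
import socket
from typing import Dict, Tuple

# Default ports taken from this guide.
DEFAULT_SERVICES: Dict[str, Tuple[str, int]] = {
    "noetl-server": ("localhost", 8082),
    "postgres": ("localhost", 5432),
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution errors.
        return False


def check_services(services: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Map each service name to whether its port is reachable."""
    return {name: port_open(host, port) for name, (host, port) in services.items()}
```

Running `check_services(DEFAULT_SERVICES)` returns a dictionary such as one entry per service with a boolean, which is easy to assert on in a setup script.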

2. Docker Compose Setup

Service Configuration

# docker-compose.yaml (key services)
services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: noetl
      POSTGRES_USER: noetl
      POSTGRES_PASSWORD: noetl
    ports:
      - "5432:5432"

  noetl-server:
    build: .
    depends_on:
      - postgres
    environment:
      DATABASE_URL: postgresql://noetl:noetl@postgres:5432/noetl
    ports:
      - "8082:8082"

Docker Commands

# Start all services
make up

# Start specific service
docker-compose up postgres
docker-compose up noetl-server

# View logs
docker-compose logs -f noetl-server
docker-compose logs -f postgres

# Reset environment
make down && make up

3. Kubernetes Setup (Advanced)

Kind Cluster Setup

# Create local Kubernetes cluster
make k8s-kind-create

# Verify cluster
kubectl cluster-info
kubectl get nodes

# Deploy NoETL services
make k8s-postgres-apply
make k8s-noetl-apply

# Port forward for testing
make postgres-port-forward # PostgreSQL on localhost:5432
kubectl port-forward -n noetl svc/noetl-server 8082:8082

Kubernetes Testing

# Kubernetes-friendly tests (no DB reset)
make test-control-flow-workbook-k8s
make test-playbook-composition-k8s

# Check pod status
kubectl get pods -n noetl

# View pod logs
kubectl logs -f -n noetl deployment/noetl-server

Configuration Management

Environment Variables

Core Configuration

# Database connection
export DATABASE_URL=postgresql://noetl:noetl@localhost:5432/noetl
export POSTGRES_HOST=localhost
export POSTGRES_PORT=5432
export POSTGRES_DB=noetl
export POSTGRES_USER=noetl
export POSTGRES_PASSWORD=noetl

# Server configuration
export NOETL_HOST=0.0.0.0
export NOETL_PORT=8082
export NOETL_LOG_LEVEL=INFO
export NOETL_WORKER_COUNT=2

# Test configuration
export NOETL_RUNTIME_TESTS=true
export NOETL_TEST_TIMEOUT=300
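
The individual POSTGRES_* variables and DATABASE_URL describe the same connection. A test helper can derive one from the other, as in the sketch below. The precedence (DATABASE_URL wins when set) and the fallback defaults are assumptions chosen to match the values in this guide, not documented NoETL behavior.

```python
import os
from typing import Mapping
from urllib.parse import quote


def database_url(env: Mapping[str, str] = os.environ) -> str:
    """Build a PostgreSQL URL from environment-style settings."""
    explicit = env.get("DATABASE_URL")
    if explicit:
        # Assumed precedence: an explicit URL overrides the parts.
        return explicit
    # Percent-encode credentials so characters like '@' stay valid.
    user = quote(env.get("POSTGRES_USER", "noetl"), safe="")
    password = quote(env.get("POSTGRES_PASSWORD", "noetl"), safe="")
    host = env.get("POSTGRES_HOST", "localhost")
    port = int(env.get("POSTGRES_PORT", "5432"))
    db = env.get("POSTGRES_DB", "noetl")
    return f"postgresql://{user}:{password}@{host}:{port}/{db}"
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to use in tests that should not touch the process environment.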

Cloud Service Configuration

# Google Cloud Storage
export GCS_BUCKET=noetl-test-bucket
export GCS_PROJECT_ID=noetl-test-project
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# AWS (if using S3)
export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
export AWS_DEFAULT_REGION=us-west-2
export S3_BUCKET=noetl-test-bucket

Configuration Files

NoETL Server Config

# config/server.yaml
server:
  host: 0.0.0.0
  port: 8082
  workers: 2

database:
  url: ${DATABASE_URL}
  pool_size: 20
  max_overflow: 10

logging:
  level: INFO
  format: json

auth:
  enabled: false # Disabled for testing
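
The `${DATABASE_URL}` placeholder implies that environment variables are substituted into the file when it is loaded. How NoETL performs that substitution is not covered here; the sketch below shows one plausible way to expand such placeholders, failing loudly when a variable is missing. `expand_placeholders` is a hypothetical helper.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} where NAME is a typical environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace every ${NAME} in `text` with env[NAME]."""
    values = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"environment variable {name} is not set")
        return values[name]

    return _PLACEHOLDER.sub(replace, text)
```

Raising on a missing variable, rather than leaving the placeholder in place, surfaces configuration mistakes at load time instead of as a confusing connection error later.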

Test Configuration

# config/test.yaml
test:
  timeout: 300
  parallel: true
  cleanup: true

database:
  reset_schema: true
  preserve_data: false

external_services:
  weather_api: true
  gcs_storage: true
  timeout: 60

Credential Management

Test Credentials Setup

PostgreSQL Credential

// tests/fixtures/credentials/pg_local.json
{
  "name": "pg_local",
  "type": "postgres",
  "connection_string": "postgresql://noetl:noetl@localhost:5432/noetl",
  "description": "Local PostgreSQL for testing"
}

GCS HMAC Credential

// tests/fixtures/credentials/gcs_hmac_local.json
{
  "name": "gcs_hmac_local",
  "type": "gcs_hmac",
  "access_key": "GOOG1E...",
  "secret_key": "your-secret-key",
  "bucket": "noetl-test-bucket",
  "description": "GCS HMAC for testing"
}
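
Before registering credential files, a quick structural check can catch missing fields. The sketch below validates the two credential types shown above; the set of required fields is inferred from these examples and is an assumption, since the server may accept or require other fields. Note that the files themselves must be plain JSON, so the leading `//` path line shown in the examples must not be part of the file.

```python
import json
from typing import Dict, List, Set

# Required fields per type, inferred from the example files (assumption).
REQUIRED_FIELDS: Dict[str, Set[str]] = {
    "postgres": {"name", "type", "connection_string"},
    "gcs_hmac": {"name", "type", "access_key", "secret_key", "bucket"},
}


def validate_credential(doc: dict) -> List[str]:
    """Return a list of problems; an empty list means the doc looks fine."""
    kind = doc.get("type")
    if kind not in REQUIRED_FIELDS:
        return [f"unknown credential type: {kind!r}"]
    required = REQUIRED_FIELDS[kind]
    problems = [f"missing field: {f}" for f in sorted(required - set(doc))]
    for field in sorted(required & set(doc)):
        value = doc[field]
        if not isinstance(value, str) or not value.strip():
            problems.append(f"empty or non-string field: {field}")
    return problems


def validate_file(path: str) -> List[str]:
    """Load a credential JSON file and validate its contents."""
    with open(path, encoding="utf-8") as handle:
        return validate_credential(json.load(handle))
```

For example, `validate_file("tests/fixtures/credentials/pg_local.json")` should return an empty list for the PostgreSQL credential shown above.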

Credential Registration

# Register all test credentials
make register-test-credentials

# Register specific credential
make register-credential FILE=tests/fixtures/credentials/pg_local.json

# Verify registration
curl -s http://localhost:8082/api/credentials | jq '.items[].name'
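
The jq filter above implies that the credentials endpoint returns an object with an `items` array whose entries carry a `name`. Under that assumption, a test can assert that the expected credentials are present, as in this sketch (`missing_credentials` is an illustrative name):

```python
import json
from typing import Iterable, Set


def missing_credentials(payload: str, expected: Iterable[str]) -> Set[str]:
    """Return the expected credential names absent from the API response.

    `payload` is the raw JSON body of GET /api/credentials, assumed to
    look like {"items": [{"name": ...}, ...]}.
    """
    body = json.loads(payload)
    registered = {item["name"] for item in body.get("items", [])}
    return set(expected) - registered
```

An empty result means every expected credential is registered; anything else names exactly what still needs to be registered.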

Cloud Service Setup

Google Cloud Storage

# 1. Create GCS bucket
gsutil mb gs://noetl-test-bucket

# 2. Generate HMAC keys
gsutil hmac create <service-account-email>

# 3. Set bucket permissions
gsutil iam ch serviceAccount:<service-account-email>:objectAdmin gs://noetl-test-bucket

# 4. Configure credential
# Use the HMAC keys in tests/fixtures/credentials/gcs_hmac_local.json

AWS S3 (Alternative)

# 1. Create S3 bucket
aws s3 mb s3://noetl-test-bucket

# 2. Create IAM user with S3 permissions
aws iam create-user --user-name noetl-test-user
aws iam attach-user-policy --user-name noetl-test-user --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess

# 3. Generate access keys
aws iam create-access-key --user-name noetl-test-user

Database Management

Schema Management

# Reset database schema (clean slate)
make postgres-reset-schema

# Apply migrations
make postgres-migrate

# Backup test data
make postgres-backup

# Restore test data
make postgres-restore

Database Utilities

# Connect to database
make postgres-connect

# Run SQL query
make postgres-query SQL="SELECT * FROM executions LIMIT 5"

# Check database status
make postgres-status

# View database logs
docker-compose logs postgres

Test Data Management

-- Clean test data
DELETE FROM executions WHERE name LIKE '%test%';
DELETE FROM queue_items WHERE created_at < NOW() - INTERVAL '1 hour';

-- Check test execution status
SELECT execution_id, name, status, created_at
FROM executions
WHERE name IN ('control_flow_workbook', 'http_duckdb_postgres', 'playbook_composition')
ORDER BY created_at DESC;

-- Monitor queue status
SELECT status, COUNT(*)
FROM queue_items
GROUP BY status;

Monitoring and Debugging

Health Checks

# Service health checks
curl http://localhost:8082/health # Server health
curl http://localhost:8082/api/status # API status
pg_isready -h localhost -p 5432 # PostgreSQL health

# Comprehensive status check
make status # All services

Log Management

# Server logs
tail -f logs/server.log

# Worker logs
tail -f logs/worker_*.log

# Event logs (structured)
tail -f logs/event.json | jq .

# Error analysis
grep -i error logs/server.log
jq 'select(.level == "ERROR")' logs/event.json
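
The same error filter can be written in Python when jq is not installed or when the events need further processing. This sketch assumes logs/event.json holds one JSON object per line with a `level` field, which is what the jq command above implies.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def error_events(lines: Iterable[str], level: str = "ERROR") -> Iterator[Dict[str, Any]]:
    """Yield parsed events whose "level" equals `level`.

    Blank lines and lines that are not valid JSON (for example a
    partially written last line) are skipped rather than raising.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict) and event.get("level") == level:
            yield event
```

A typical use is to open the file and pass the handle directly, since a text file iterates line by line: `with open("logs/event.json") as fh: errors = list(error_events(fh))`.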

Performance Monitoring

# Resource usage
docker stats # Container resources
htop # System resources
iotop # I/O usage

# Database performance
make postgres-stats # Database statistics
make postgres-query SQL="EXPLAIN ANALYZE SELECT * FROM executions" # Query performance

Debug Tools

# Interactive debugging
pytest --pdb tests/test_control_flow_workbook.py

# Verbose test output
pytest -v -s tests/

# Profile test performance (requires the pytest-profiling plugin)
pytest --profile tests/

# Generate test coverage (requires the pytest-cov plugin)
pytest --cov=noetl --cov-report=html tests/

Troubleshooting

Common Issues

Port Conflicts

# Symptom: "Port 8082 already in use"
# Solution: Kill existing processes
lsof -ti:8082 | xargs kill -9
make noetl-restart

Database Connection Issues

# Symptom: "could not connect to server"
# Solution: Check PostgreSQL status
docker-compose ps postgres # Check container status
make postgres-status # Test connection
make postgres-reset-schema # Reset if corrupted

Missing Dependencies

# Symptom: "ModuleNotFoundError"
# Solution: Reinstall dependencies
make install-dev # Reinstall all dependencies
source .venv/bin/activate # Ensure venv activated

Credential Issues

# Symptom: "credential 'pg_local' not found"
# Solution: Register credentials
make register-test-credentials # Register all credentials
curl http://localhost:8082/api/credentials # Verify registration

Recovery Procedures

Complete Environment Reset

# Nuclear option: reset everything
make down # Stop all services
docker system prune -af --volumes # Clean Docker (removes ALL unused images, containers, and volumes on this host, not only NoETL's)
make up # Restart services
make postgres-reset-schema # Reset database (PostgreSQL must be running)
make register-test-credentials # Re-register credentials
make test # Verify functionality

Service-Specific Reset

# Reset NoETL server only
make noetl-restart

# Reset PostgreSQL only
make postgres-restart
make postgres-reset-schema

# Reset workers only
make worker-restart

Performance Tuning

Database Optimization

# PostgreSQL tuning for tests
# Add to postgresql.conf, or pass each setting to the postgres container as a "-c name=value" server argument

shared_buffers = 256MB
work_mem = 4MB
maintenance_work_mem = 64MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
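
When PostgreSQL runs from the official Docker image, server settings can be supplied as `-c name=value` arguments instead of editing postgresql.conf. The sketch below turns the settings above into such an argument list; `TEST_TUNING` and `postgres_server_args` are illustrative names.

```python
from typing import Dict, List

# Values copied from the tuning block in this guide.
TEST_TUNING: Dict[str, str] = {
    "shared_buffers": "256MB",
    "work_mem": "4MB",
    "maintenance_work_mem": "64MB",
    "checkpoint_completion_target": "0.9",
    "wal_buffers": "16MB",
    "default_statistics_target": "100",
}


def postgres_server_args(settings: Dict[str, str]) -> List[str]:
    """Convert settings into ["-c", "name=value", ...] for the server."""
    args: List[str] = []
    for name, value in settings.items():
        args += ["-c", f"{name}={value}"]
    return args
```

The resulting list can be appended after `postgres` in a Compose `command:` entry or a `docker run` invocation, which keeps the tuning alongside the rest of the test configuration.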

Service Optimization

# Optimize for testing
export NOETL_WORKER_COUNT=1 # Reduce workers for testing
export NOETL_LOG_LEVEL=WARNING # Reduce log verbosity
export NOETL_DB_POOL_SIZE=10 # Optimize connection pooling

Test Performance

# Parallel test execution (requires the pytest-xdist plugin)
pytest -n auto tests/ # Auto-detect CPU cores
pytest -n 4 tests/ # Specific parallel count

# Test result caching
pytest --cache-clear # Clear stale cache
pytest --lf # Run last failed tests only