Monitoring

Monitor coordinator health, execution throughput, and abuse-control behavior.

Base URL

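Replace {COORDINATOR_URL} in the examples on this page with your coordinator service URL. The examples append /api themselves, so use the service origin without the /api suffix that appears in the base URLs below.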

Local Development:

http://localhost:3001/api

Production:

https://your-coordinator-domain.com/api

Health Check

Check coordinator status:

curl {COORDINATOR_URL}/api/health

Response:

{
  "status": "healthy",
  "database": "connected",
  "timestamp": 1777500000000,
  "schedulesRegistered": 4
}

When the database is unavailable, the coordinator responds with 503 and an unhealthy payload.
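
The check above can be automated with a small probe. The following is a minimal sketch, assuming Node 18+ (global fetch). The names HealthPayload, isHealthy, and checkCoordinator are hypothetical, the field names mirror the sample response, and because the shape of the unhealthy payload is not documented here, the sketch treats anything other than a 2xx response with status "healthy" as unhealthy.

```typescript
// Fields from the sample response above. The unhealthy payload shape is an
// assumption, so every field is optional.
interface HealthPayload {
  status?: string;
  database?: string;
  timestamp?: number;
  schedulesRegistered?: number;
}

// Pure classification: healthy only on a 2xx response whose status is "healthy".
// A 503 (database unavailable) therefore always counts as unhealthy.
function isHealthy(httpStatus: number, body: HealthPayload | null): boolean {
  const is2xx = httpStatus >= 200 && httpStatus < 300;
  return is2xx && body !== null && body.status === "healthy";
}

// Hypothetical helper: calls GET {coordinatorUrl}/api/health and classifies it.
async function checkCoordinator(coordinatorUrl: string): Promise<boolean> {
  const res = await fetch(`${coordinatorUrl}/api/health`);
  let body: HealthPayload | null = null;
  try {
    body = (await res.json()) as HealthPayload;
  } catch {
    // Non-JSON body: leave body as null so the probe reports unhealthy.
  }
  return isHealthy(res.status, body);
}
```

Keeping isHealthy separate from the network call makes the rule easy to unit test and reuse in an uptime check.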

Logs

Monitor coordinator logs for:

schedule polling
execution run creation
stage-level attempts (delegate, claim, commit)
retries and exhausted runs
schedule registration success / rejection
rate-limit decisions
Redis limiter connectivity and fallback events

Development:

npm run dev

Production:

npm run start

Database Monitoring

Use Drizzle Studio to inspect the coordinator database:

npm run db:studio

Current tables:

schedules - registered schedules, recipient payloads, Merkle proofs
execution_runs - execution-run state for each scheduled payout window
execution_attempts - stage-level attempt history

Metrics

The coordinator exports Prometheus metrics at GET /api/metrics.

This endpoint is protected by default. When METRICS_PUBLIC=false, callers must send a bearer token:

Authorization: Bearer <METRICS_AUTH_TOKEN>

Example:

curl {COORDINATOR_URL}/api/metrics \
  -H "Authorization: Bearer <METRICS_AUTH_TOKEN>"

Key metrics exposed today:

veil_scheduler_polls_total
veil_schedules_detected_due_total
veil_execution_runs_created_total
veil_execution_stage_total
veil_claim_results_total
veil_api_requests_total
veil_api_request_duration_seconds
veil_api_rate_limit_decisions_total
veil_api_concurrency_limit_decisions_total
veil_api_rate_limit_backend_events_total
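
For ad hoc checks without a Prometheus server, the metrics text can be scraped and summed directly. The following is a minimal sketch, assuming Node 18+ (global fetch) and the standard Prometheus text exposition format. fetchMetrics and sumMetric are hypothetical helper names, and the sketch makes no assumption about which labels each metric carries: it adds up every sample of the named metric.

```typescript
// Hypothetical helper: scrape GET {coordinatorUrl}/api/metrics, sending the
// bearer token when one is provided.
async function fetchMetrics(coordinatorUrl: string, token?: string): Promise<string> {
  const headers: Record<string, string> = {};
  if (token) headers["Authorization"] = `Bearer ${token}`;
  const res = await fetch(`${coordinatorUrl}/api/metrics`, { headers });
  if (!res.ok) throw new Error(`metrics scrape failed: HTTP ${res.status}`);
  return res.text();
}

// Sum every sample of one metric in Prometheus text exposition format.
// Lines starting with "#" are HELP/TYPE comments and are skipped.
function sumMetric(expositionText: string, metricName: string): number {
  let total = 0;
  for (const rawLine of expositionText.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    // A sample looks like: name{label="v"} value [timestamp]
    const nameEnd = line.search(/[{\s]/);
    if (nameEnd === -1 || line.slice(0, nameEnd) !== metricName) continue;
    const afterLabels = line.includes("}")
      ? line.slice(line.lastIndexOf("}") + 1)
      : line.slice(nameEnd);
    const value = Number(afterLabels.trim().split(/\s+/)[0]);
    if (!Number.isNaN(value)) total += value;
  }
  return total;
}
```

For example, sumMetric(await fetchMetrics(url, token), "veil_scheduler_polls_total") returns the current poll count. The name must match exactly, so histogram series such as those ending in _bucket are not folded into their base metric.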

Rate-Limit Monitoring

If rate limiting is enabled, watch for:

repeated 429 responses on POST /api/schedules
bursts of limited outcomes in veil_api_rate_limit_decisions_total
repeated concurrency limiting on registration requests
memory-fallback backend events, which indicate Redis is unavailable

In dry-run mode, the coordinator logs what would have been limited without blocking the request.
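
One way to turn "bursts of limited outcomes" into a concrete check is to compare two consecutive scrapes of veil_api_rate_limit_decisions_total. The following is a minimal sketch under stated assumptions: the caller has already extracted the counter values for limited decisions and for all decisions (the label that distinguishes them is not documented here), and the function names and the 0.2 threshold are hypothetical.

```typescript
// Increase of a Prometheus counter between two scrapes. Counters restart from
// zero when the process restarts, so a drop is treated as a reset and the
// current value is taken as the increase.
function counterIncrease(previous: number, current: number): number {
  return current >= previous ? current - previous : current;
}

// Fraction of rate-limit decisions in the scrape window that were limited.
// Returns 0 when there were no decisions at all.
function limitedRatio(limitedIncrease: number, totalIncrease: number): number {
  return totalIncrease > 0 ? limitedIncrease / totalIncrease : 0;
}

// Hypothetical alert rule: flag the window when the limited share of decisions
// reaches the threshold (0.2 is an illustrative value, not a recommendation).
function isLimitBurst(
  prev: { limited: number; total: number },
  curr: { limited: number; total: number },
  threshold = 0.2,
): boolean {
  const limited = counterIncrease(prev.limited, curr.limited);
  const total = counterIncrease(prev.total, curr.total);
  return limitedRatio(limited, total) >= threshold;
}
```

Working with increases rather than raw totals keeps the check meaningful across coordinator restarts. In dry-run mode the same counters can be used to size limits before enforcing them, assuming dry-run decisions are also recorded in the metric.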

Troubleshooting

Coordinator not executing schedules

check logs for execution-stage failures
verify database connectivity
verify ER authority keypair loading
verify Solana RPC connectivity

/api/metrics returns 403

confirm METRICS_PUBLIC=false is intentional
send Authorization: Bearer <METRICS_AUTH_TOKEN>
verify METRICS_AUTH_TOKEN is set on the deployed service

Database errors

check DATABASE_URL
verify PostgreSQL is reachable
run migrations with npm run db:migrate

Redis limiter errors

confirm RATE_LIMIT_REDIS_URL is valid
expect the first connection to be slower if your managed Redis instance was sleeping
if Redis is unavailable, registration falls back to in-memory limiting and read routes fail open
