Operations Monitoring

GOVERN is monitored through Cloudflare Analytics, the Supabase and Upstash Redis dashboards, and Langfuse.

Primary observability surfaces

| Tool | What it monitors |
| --- | --- |
| Cloudflare Workers Analytics | API latency, error rates, CPU time, request volume |
| Cloudflare Pages Analytics | Web app traffic, error rates |
| Supabase Dashboard | Database query performance, connection pool, disk usage |
| Upstash Redis Dashboard | Cache hit rate, connection count, memory usage |
| Langfuse | AI call traces, model latency, token usage |

Key metrics to watch

| Metric | Warning | Critical |
| --- | --- | --- |
| API P95 latency | > 300 ms | > 1000 ms |
| Error rate | > 0.5% | > 2% |
| Assessment throughput | Drop > 20% | Drop > 50% |
| DB connection pool | > 70% used | > 90% used |
| Redis memory | > 70% used | > 85% used |
| Disk usage (Supabase) | > 70% | > 85% |
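
The thresholds listed above can be encoded as data and checked with a small helper. The following is a minimal sketch, not GOVERN's actual alerting code: the metric keys (`api_p95_latency_ms`, `error_rate_pct`, and so on) and the `classify` function are hypothetical names chosen for illustration, and the throughput metric is assumed to arrive already expressed as a percentage drop against whatever baseline the team uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float


# Values copied from the thresholds above; the keys are hypothetical names.
THRESHOLDS = {
    "api_p95_latency_ms": Threshold(warning=300, critical=1000),
    "error_rate_pct": Threshold(warning=0.5, critical=2),
    # Assumed to be a percentage drop relative to a baseline defined elsewhere.
    "assessment_throughput_drop_pct": Threshold(warning=20, critical=50),
    "db_pool_used_pct": Threshold(warning=70, critical=90),
    "redis_memory_used_pct": Threshold(warning=70, critical=85),
    "supabase_disk_used_pct": Threshold(warning=70, critical=85),
}


def classify(metric: str, value: float) -> str:
    """Return 'ok', 'warning', or 'critical' for a metric reading.

    Comparisons are strict (>), matching the '>' thresholds listed above.
    """
    threshold = THRESHOLDS[metric]
    if value > threshold.critical:
        return "critical"
    if value > threshold.warning:
        return "warning"
    return "ok"
```

For example, `classify("api_p95_latency_ms", 450)` falls between the two latency thresholds and is reported as a warning, while a reading of exactly 300 is still `ok` because the comparison is strict.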

Alerting

Critical- and high-severity alerts are routed through PagerDuty. Medium- and low-severity alerts go to the #govern-ops Slack channel.

On-call rotation is managed in PagerDuty. The current on-call schedule is visible in the PagerDuty portal.
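
The routing rule described in this section can be written as a simple lookup. This is a sketch under assumptions: the `route_alert` function and the destination strings are hypothetical and do not call PagerDuty or Slack; they only show which destination each severity level maps to.

```python
# Severity-to-destination mapping taken from the routing rule in this section.
# The destination strings are illustrative labels, not real API identifiers.
ROUTES = {
    "critical": "pagerduty",
    "high": "pagerduty",
    "medium": "slack:#govern-ops",
    "low": "slack:#govern-ops",
}


def route_alert(severity: str) -> str:
    """Return the destination for an alert of the given severity."""
    key = severity.strip().lower()
    if key not in ROUTES:
        # Fail loudly rather than silently dropping an alert.
        raise ValueError(f"unknown severity: {severity!r}")
    return ROUTES[key]
```

Raising on an unknown severity is a deliberate choice in this sketch: a misspelled level should surface as an error instead of being routed nowhere.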

Synthetic monitoring

Synthetic health checks run every 60 seconds from three regions (US, EU, APAC):

GET https://govern-api.archetypal.ai/health
GET https://govern-dashboard.pages.dev
GET https://govern-docs.pages.dev

A failed check triggers a PagerDuty incident immediately.
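
To reproduce these checks by hand, for example while investigating an incident, a script along the following lines may help. It is a minimal sketch, not the production monitor: the function names, the 10-second timeout, and the rule that only a 2xx response counts as healthy are all assumptions, since the source does not define what counts as a failure.

```python
import urllib.error
import urllib.request
from typing import Callable, List

# Endpoints listed in this section.
HEALTH_URLS = [
    "https://govern-api.archetypal.ai/health",
    "https://govern-dashboard.pages.dev",
    "https://govern-docs.pages.dev",
]


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a GET request, or 0 if it did not complete."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx or 5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return 0


def failed_checks(
    urls: List[str], fetch: Callable[[str], int] = http_status
) -> List[str]:
    """Return the URLs whose status is outside 2xx (assumed failure rule)."""
    return [url for url in urls if not 200 <= fetch(url) < 300]
```

Calling `failed_checks(HEALTH_URLS)` performs the three GET requests and returns the URLs that failed; an empty list means all three responded with a 2xx status. The `fetch` parameter exists so the logic can be exercised with a stub instead of real network calls.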