Cost Governor
The Cost Governor tracks every token spent by the GOVERN platform — by model, by archetype, by session, and by customer org. It enforces budget thresholds and triggers alerts before spend becomes a problem.
What Gets Tracked
Platform infrastructure spend (Archetypal AI’s cost)
- Claude Sonnet 4.6 calls from the API gateway council deliberations
- Claude Haiku 4.5 calls from the Autonomy Kernel heartbeats
- OpenAI TTS-1 calls for archetype voice synthesis
- Workers AI Llama 8B/70B calls from agent deliberations
- Supabase database read/write costs
- Cloudflare R2 storage costs
- Upstash Redis request counts
Customer AI spend (tracked for governance, not billed by us)
- Monitoring events captured from customer AI systems
- Token counts reported via the probe container or SDK
- Cost estimates based on provider pricing tables
The Cost Governor’s Internal Dashboard shows platform infrastructure spend. Customer AI spend appears in the customer-facing Reporting surface.
Cost Tracking Architecture
```
[API call made]
        ↓
[AI provider response includes token counts]
        ↓
[api-gateway extracts: model, input_tokens, output_tokens]
        ↓
[POST /api/monitoring/emit with cost payload]
        ↓
[Monitoring pipeline accumulates + rolls up]
        ↓
[token_usage table updated]
        ↓
[Cost Governor queries + alerts]
```
Token cost table
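A minimal sketch of the extract-and-emit steps in the pipeline, assuming per-1M-token prices like those listed in this section's table. The `ProviderUsage` and `CostPayload` shapes and the `buildCostPayload` name are hypothetical; the actual payload accepted by `POST /api/monitoring/emit` is not specified in this document.

```typescript
// Hypothetical shapes for illustration only; the real gateway types are not shown here.
interface ProviderUsage {
  model: string;
  inputTokens: number;
  outputTokens: number;
}

interface CostPayload extends ProviderUsage {
  costUsd: number;
}

// USD per 1M tokens, copied from two entries of the token cost table.
const PRICES_PER_1M: Record<string, { input: number; output: number }> = {
  "claude-sonnet-4-20250514": { input: 3.0, output: 15.0 },
  "claude-haiku-4-5-20251001": { input: 0.8, output: 4.0 },
};

// Turn the token counts from a provider response into a cost payload.
function buildCostPayload(usage: ProviderUsage): CostPayload {
  const price = PRICES_PER_1M[usage.model];
  if (price === undefined) {
    // Fail loudly rather than silently recording zero cost for an unknown model.
    throw new Error(`No price entry for model: ${usage.model}`);
  }
  const costUsd =
    (usage.inputTokens * price.input + usage.outputTokens * price.output) / 1e6;
  return { ...usage, costUsd };
}
```

Under these assumptions, the returned object would be serialized as the body of the emit request. Rejecting unknown models is a design choice in this sketch, so that a missing price entry surfaces as an error instead of as under-reported spend.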
```typescript
const TOKEN_COSTS_USD_PER_1M = {
  'claude-sonnet-4-20250514': { input: 3.00, output: 15.00 },
  'claude-haiku-4-5-20251001': { input: 0.80, output: 4.00 },
  'gpt-4o': { input: 5.00, output: 15.00 },
  'gpt-4o-mini': { input: 0.15, output: 0.60 },
  'llama-3.1-8b-instruct': { input: 0.01, output: 0.01 },
  'llama-3.1-70b-instruct': { input: 0.05, output: 0.05 },
  'tts-1': { characters: 0.015 }, // per 1000 characters
};
```
Budget Thresholds
Platform infrastructure budgets
| Category | Daily budget | Monthly budget | Alert at |
|---|---|---|---|
| Council deliberations (Claude Sonnet) | $10 | $200 | 80% |
| Autonomy kernel (Claude Haiku) | $2 | $50 | 80% |
| TTS synthesis (OpenAI) | $1 | $20 | 90% |
| Workers AI (all models) | $1 | $20 | 80% |
| Total platform | $15 | $300 | 80% |
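One way to read the table above is as configuration data. The sketch below encodes it and lists the categories whose daily spend has reached their "Alert at" percentage. The key names, the `CategoryBudget` shape, and the `categoriesOverAlert` function are hypothetical, and applying "Alert at" to the daily budget (rather than the monthly one) is an assumption.

```typescript
interface CategoryBudget {
  dailyUsd: number;
  monthlyUsd: number;
  alertAtPct: number;
}

// Values copied from the platform infrastructure budgets table; key names are illustrative.
const PLATFORM_BUDGETS: Record<string, CategoryBudget> = {
  council: { dailyUsd: 10, monthlyUsd: 200, alertAtPct: 80 },
  autonomyKernel: { dailyUsd: 2, monthlyUsd: 50, alertAtPct: 80 },
  tts: { dailyUsd: 1, monthlyUsd: 20, alertAtPct: 90 },
  workersAi: { dailyUsd: 1, monthlyUsd: 20, alertAtPct: 80 },
  total: { dailyUsd: 15, monthlyUsd: 300, alertAtPct: 80 },
};

// Return the categories whose daily utilization is at or above their alert threshold.
// Categories with no recorded spend are treated as 0 USD.
function categoriesOverAlert(dailySpendUsd: Record<string, number>): string[] {
  return Object.entries(PLATFORM_BUDGETS)
    .filter(([name, budget]) => {
      const spent = dailySpendUsd[name] ?? 0;
      return (spent / budget.dailyUsd) * 100 >= budget.alertAtPct;
    })
    .map(([name]) => name);
}
```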
Alert levels
| Level | Condition | Action |
|---|---|---|
| INFO | 50% of daily budget | Log only |
| WARNING | 80% of daily budget | Slack notification |
| CRITICAL | 95% of daily budget | PagerDuty + auto-throttle |
| HARD STOP | 100% of daily budget | Reject new requests for the period |
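The alert levels above amount to a threshold ladder over daily utilization. A minimal sketch, assuming utilization is computed as spend divided by daily budget; the `alertLevelFor` name and the `NONE` level for spend below 50% are hypothetical additions for illustration.

```typescript
type AlertLevel = "NONE" | "INFO" | "WARNING" | "CRITICAL" | "HARD_STOP";

// Map current daily spend to an alert level, checking the highest threshold first.
function alertLevelFor(spendUsd: number, dailyBudgetUsd: number): AlertLevel {
  if (dailyBudgetUsd <= 0) {
    throw new Error("dailyBudgetUsd must be positive");
  }
  const pct = (spendUsd / dailyBudgetUsd) * 100;
  if (pct >= 100) return "HARD_STOP"; // reject new requests for the period
  if (pct >= 95) return "CRITICAL";   // PagerDuty + auto-throttle
  if (pct >= 80) return "WARNING";    // Slack notification
  if (pct >= 50) return "INFO";       // log only
  return "NONE";
}
```

Checking from the top of the ladder down means each spend value maps to exactly one level, so a caller can trigger only the action for that level.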
Cost Governor API
```bash
# Current period spend
curl "$JARVIS_API_URL/api/monitoring/cost/current" \
  -H "Authorization: Bearer $AUTH_SECRET" | jq .

# Response:
# {
#   "period": "2026-04-12",
#   "totalUsd": 4.23,
#   "budgetUsd": 15.00,
#   "utilizationPct": 28.2,
#   "breakdown": {
#     "claude-sonnet-4-20250514": { "usd": 3.12, "inputTokens": 482000, "outputTokens": 127000 },
#     "claude-haiku-4-5-20251001": { "usd": 0.89, "inputTokens": 1240000, "outputTokens": 147000 },
#     "tts-1": { "usd": 0.22, "characters": 14700 }
#   }
# }
```
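For callers that consume the current-period response, a small consistency check can catch rollup drift. The sketch below types the response from the fields shown in the sample and verifies that the per-model breakdown sums to `totalUsd` and that `utilizationPct` matches `totalUsd / budgetUsd`. The interface names, the `checkCostResponse` function, and the tolerances are assumptions, not part of the documented API.

```typescript
// Shapes inferred from the sample response; optional fields vary by model type.
interface ModelSpend {
  usd: number;
  inputTokens?: number;
  outputTokens?: number;
  characters?: number;
}

interface CurrentCostResponse {
  period: string;
  totalUsd: number;
  budgetUsd: number;
  utilizationPct: number;
  breakdown: Record<string, ModelSpend>;
}

// Return a list of human-readable problems; an empty list means the response is consistent.
function checkCostResponse(r: CurrentCostResponse, toleranceUsd = 0.005): string[] {
  const problems: string[] = [];
  const sum = Object.values(r.breakdown).reduce((acc, m) => acc + m.usd, 0);
  if (Math.abs(sum - r.totalUsd) > toleranceUsd) {
    problems.push(`breakdown sums to ${sum.toFixed(2)}, totalUsd is ${r.totalUsd}`);
  }
  const pct = (r.totalUsd / r.budgetUsd) * 100;
  // utilizationPct appears rounded to one decimal place, so allow half a unit in that place.
  if (Math.abs(pct - r.utilizationPct) > 0.05) {
    problems.push(`utilizationPct is ${r.utilizationPct}, expected about ${pct.toFixed(1)}`);
  }
  return problems;
}
```

Applied to the sample response above, both checks pass: 3.12 + 0.89 + 0.22 is 4.23, and 4.23 / 15.00 is 28.2%.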
```bash
# Historical spend (last 30 days)
curl "$JARVIS_API_URL/api/monitoring/cost/history?days=30" \
  -H "Authorization: Bearer $AUTH_SECRET" | jq .
```
```bash
# Budget alert status
curl "$JARVIS_API_URL/api/monitoring/cost/alerts" \
  -H "Authorization: Bearer $AUTH_SECRET" | jq .
```
Internal Dashboard — Cost Panel
The Cost Governor panel in the Internal Dashboard shows:
Today’s spend meter — Circular gauge showing current daily spend vs budget. Color: green (< 50%), yellow (50–80%), orange (80–95%), red (> 95%).
Spend by model — Bar chart showing token counts and USD cost per model for the current period.
Spend trend (30 days) — Line chart showing daily spend over the last 30 days with budget line overlay.
Budget utilization — Table showing each category, current spend, budget, and utilization percentage.
Cost anomaly detector — Flags any day where spend is > 2x the 7-day rolling average. This catches runaway loops, infinite retries, or unexpectedly expensive prompts.
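The anomaly rule can be sketched in a few lines. This version assumes the 7-day rolling average covers the seven days before the day being checked (the document does not say whether the current day is included), and the `isSpendAnomaly` name is hypothetical.

```typescript
// previousDaysUsd: daily spend totals in chronological order, excluding the day under test.
// Returns true when todayUsd exceeds `factor` times the average of the last 7 entries.
function isSpendAnomaly(previousDaysUsd: number[], todayUsd: number, factor = 2): boolean {
  const recent = previousDaysUsd.slice(-7);
  if (recent.length === 0) {
    return false; // no history yet, so there is no baseline to compare against
  }
  const average = recent.reduce((sum, usd) => sum + usd, 0) / recent.length;
  return todayUsd > factor * average;
}
```

With a steady history of $4 per day, a $9 day would be flagged and a $7 day would not.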
Cost Optimization Insights
The Cost Governor also surfaces optimization opportunities:
Haiku vs Sonnet routing opportunities
Calls that used Sonnet but could have used Haiku (short, non-reasoning tasks):
```sql
SELECT
  description,
  model,
  input_tokens,
  output_tokens,
  total_cost_usd,
  -- Sonnet cost minus Haiku cost, using the per-1M-token prices from the token cost table
  ((input_tokens * 3.00 + output_tokens * 15.00)
   - (input_tokens * 0.80 + output_tokens * 4.00)) / 1000000.0 AS potential_savings_usd
FROM token_usage
WHERE model = 'claude-sonnet-4-20250514'
  AND output_tokens < 100  -- Short outputs might not need Sonnet
  AND created_at > NOW() - INTERVAL '7 days'
ORDER BY total_cost_usd DESC
LIMIT 20;
```
Council deliberation cost breakdown
How much does each council deliberation cost vs the build event quality it produces?
```sql
SELECT
  be.description,
  SUM(tu.total_cost_usd) AS deliberation_cost,
  be.quality,
  be.quality / NULLIF(SUM(tu.total_cost_usd), 0) AS quality_per_dollar
FROM build_events be
JOIN token_usage tu ON tu.session_id = be.metadata->>'sessionId'
WHERE be.type = 'deliberation'
  AND be.created_at > NOW() - INTERVAL '30 days'
GROUP BY be.id, be.description, be.quality
ORDER BY quality_per_dollar DESC;
```
Database Schema
```sql
CREATE TABLE token_usage (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES profiles(id),
  org_id UUID REFERENCES orgs(id),
  session_id TEXT,
  model TEXT NOT NULL,
  provider TEXT NOT NULL,
  input_tokens BIGINT DEFAULT 0,
  output_tokens BIGINT DEFAULT 0,
  total_tokens BIGINT GENERATED ALWAYS AS (input_tokens + output_tokens) STORED,
  total_cost_usd NUMERIC(10,6),
  description TEXT,
  metadata JSONB DEFAULT '{}',
  created_at TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX idx_token_usage_user ON token_usage(user_id, created_at DESC);
CREATE INDEX idx_token_usage_model ON token_usage(model, created_at DESC);
CREATE INDEX idx_token_usage_org ON token_usage(org_id, created_at DESC);
```
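To illustrate how rows in this table relate to the per-model breakdown returned by the cost API, the sketch below rolls a set of rows up by model. The `TokenUsageRow` type mirrors only a subset of the columns, and it assumes `BIGINT` and `NUMERIC` values have already been converted to JavaScript numbers (some Postgres clients return them as strings). The `rollUpByModel` name is hypothetical, and the real rollup may well happen in SQL instead.

```typescript
// Subset of token_usage columns, with numeric columns already parsed to numbers.
interface TokenUsageRow {
  model: string;
  provider: string;
  input_tokens: number;
  output_tokens: number;
  total_cost_usd: number | null; // the column is nullable in the schema
}

interface ModelRollup {
  usd: number;
  inputTokens: number;
  outputTokens: number;
}

// Accumulate cost and token counts per model.
function rollUpByModel(rows: TokenUsageRow[]): Record<string, ModelRollup> {
  const rollup: Record<string, ModelRollup> = {};
  for (const row of rows) {
    const entry = rollup[row.model] ?? { usd: 0, inputTokens: 0, outputTokens: 0 };
    entry.usd += row.total_cost_usd ?? 0; // treat a missing cost as zero
    entry.inputTokens += row.input_tokens;
    entry.outputTokens += row.output_tokens;
    rollup[row.model] = entry;
  }
  return rollup;
}
```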