Cost Governor

The Cost Governor tracks every token spent by the GOVERN platform, broken down by model, archetype, session, and customer org. It enforces budget thresholds and raises alerts before spend exceeds budget.

What Gets Tracked

Platform infrastructure spend (Archetypal AI’s cost)

Claude Sonnet 4 calls (claude-sonnet-4-20250514) for council deliberations in the API gateway
Claude Haiku 4.5 calls from the Autonomy Kernel heartbeats
OpenAI TTS-1 calls for archetype voice synthesis
Workers AI Llama 8B/70B calls from agent deliberations
Supabase database read/write costs
Cloudflare R2 storage costs
Upstash Redis request counts

Customer AI spend (tracked for governance, not billed by us)

Monitoring events captured from customer AI systems
Token counts reported via the probe container or SDK
Cost estimates based on provider pricing tables

The Cost Governor panel in the Internal Dashboard shows platform infrastructure spend only. Customer AI spend appears in the customer-facing Reporting surface.

Cost Tracking Architecture

[API call made]
     ↓
[AI provider response includes token counts]
     ↓
[api-gateway extracts: model, input_tokens, output_tokens]
     ↓
[POST /api/monitoring/emit with cost payload]
     ↓
[Monitoring pipeline accumulates + rolls up]
     ↓
[token_usage table updated]
     ↓
[Cost Governor queries + alerts]
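The extraction and emit steps above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: the `ProviderResponse`, `CostEvent`, and `EmitRequest` shapes, the `buildCostEvent` and `buildEmitRequest` helpers, and the JSON field names are all assumptions, since the payload schema of /api/monitoring/emit is not specified here. Only the extracted fields (model, input_tokens, output_tokens) and the endpoint path come from the diagram.

```typescript
// Hypothetical shape of the subset of a provider response the gateway reads.
interface ProviderResponse {
  model: string;
  usage: { input_tokens: number; output_tokens: number };
}

// Hypothetical cost payload; the real emit schema may differ.
interface CostEvent {
  type: "cost";
  model: string;
  inputTokens: number;
  outputTokens: number;
  sessionId: string;
  emittedAt: string; // ISO 8601 timestamp
}

// Hypothetical description of the HTTP request to send.
interface EmitRequest {
  method: "POST";
  path: string;
  headers: Record<string, string>;
  body: string;
}

// Step 3 of the diagram: pull model and token counts out of the response.
function buildCostEvent(
  response: ProviderResponse,
  sessionId: string,
  now: Date = new Date(),
): CostEvent {
  return {
    type: "cost",
    model: response.model,
    inputTokens: response.usage.input_tokens,
    outputTokens: response.usage.output_tokens,
    sessionId,
    emittedAt: now.toISOString(),
  };
}

// Step 4 of the diagram: describe the POST to the monitoring endpoint.
// Bearer auth is assumed to match the curl examples in this document.
function buildEmitRequest(event: CostEvent, authSecret: string): EmitRequest {
  return {
    method: "POST",
    path: "/api/monitoring/emit",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${authSecret}`,
    },
    body: JSON.stringify(event),
  };
}
```

Keeping both helpers free of network calls makes them easy to unit test; the caller passes the resulting request to whatever HTTP client the gateway runtime provides.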

Token cost table

const TOKEN_COSTS_USD_PER_1M = {
  'claude-sonnet-4-20250514': { input: 3.00, output: 15.00 },
  'claude-haiku-4-5-20251001': { input: 0.80, output: 4.00 },
  'gpt-4o': { input: 5.00, output: 15.00 },
  'gpt-4o-mini': { input: 0.15, output: 0.60 },
  'llama-3.1-8b-instruct': { input: 0.01, output: 0.01 },
  'llama-3.1-70b-instruct': { input: 0.05, output: 0.05 },
  'tts-1': { characters: 0.015 },  // exception: USD per 1,000 characters, not per 1M tokens
};
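To show how these rates turn into a dollar figure, here is a minimal sketch of a cost estimator. The `estimateCostUsd` function, the `Usage` type, and the local `RATES` constant (a three-entry copy of the table, so the example is self-contained) are hypothetical names introduced for illustration. It assumes token rates are USD per 1M tokens and the tts-1 rate is USD per 1,000 characters.

```typescript
type TokenRate = { input: number; output: number }; // USD per 1M tokens
type CharacterRate = { characters: number };        // USD per 1,000 characters

// Illustrative subset of the pricing table.
const RATES: Record<string, TokenRate | CharacterRate> = {
  "claude-sonnet-4-20250514": { input: 3.0, output: 15.0 },
  "claude-haiku-4-5-20251001": { input: 0.8, output: 4.0 },
  "tts-1": { characters: 0.015 },
};

interface Usage {
  inputTokens?: number;
  outputTokens?: number;
  characters?: number;
}

function estimateCostUsd(model: string, usage: Usage): number {
  const rate = RATES[model];
  if (rate === undefined) {
    // Fail loudly so unpriced models are not silently recorded as free.
    throw new Error(`No pricing entry for model: ${model}`);
  }
  if ("characters" in rate) {
    return ((usage.characters ?? 0) / 1000) * rate.characters;
  }
  const inputCost = (usage.inputTokens ?? 0) * rate.input;
  const outputCost = (usage.outputTokens ?? 0) * rate.output;
  return (inputCost + outputCost) / 1_000_000;
}
```

For example, 14,700 characters of tts-1 synthesis comes to 14.7 × $0.015 = $0.2205, and one million Sonnet input tokens with no output comes to $3.00.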

Budget Thresholds

Platform infrastructure budgets

Category	Daily budget	Monthly budget	Alert at
Council deliberations (Claude Sonnet)	$10	$200	80%
Autonomy kernel (Claude Haiku)	$2	$50	80%
TTS synthesis (OpenAI)	$1	$20	90%
Workers AI (all models)	$1	$20	80%
Total platform	$15	$300	80%

Alert levels

Level	Condition	Action
INFO	50% of daily budget	Log only
WARNING	80% of daily budget	Slack notification
CRITICAL	95% of daily budget	PagerDuty + auto-throttle
HARD STOP	100% of daily budget	Reject new requests for the rest of the period
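The alert levels map directly onto a threshold check. The sketch below is one possible reading, not the platform's implementation: `alertLevel` and the `AlertLevel` type are hypothetical names, the extra `NONE` value covers spend below 50%, and each threshold is assumed to be inclusive (the level applies at or above the stated percentage).

```typescript
type AlertLevel = "NONE" | "INFO" | "WARNING" | "CRITICAL" | "HARD_STOP";

function alertLevel(spendUsd: number, dailyBudgetUsd: number): AlertLevel {
  if (dailyBudgetUsd <= 0) {
    throw new RangeError("dailyBudgetUsd must be positive");
  }
  const utilization = spendUsd / dailyBudgetUsd;
  // Check the most severe level first so each request maps to exactly one level.
  if (utilization >= 1.0) return "HARD_STOP";
  if (utilization >= 0.95) return "CRITICAL";
  if (utilization >= 0.8) return "WARNING";
  if (utilization >= 0.5) return "INFO";
  return "NONE";
}
```

With the $15 total daily budget, $4.23 of spend (28.2%) maps to `NONE`, $13.00 (about 86.7%) to `WARNING`, and $15.00 to `HARD_STOP`.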

Cost Governor API

# Current period spend
curl "$JARVIS_API_URL/api/monitoring/cost/current" \
  -H "Authorization: Bearer $AUTH_SECRET" | jq .

# Response:
# {
#   "period": "2026-04-12",
#   "totalUsd": 4.23,
#   "budgetUsd": 15.00,
#   "utilizationPct": 28.2,
#   "breakdown": {
#     "claude-sonnet-4-20250514": { "usd": 3.12, "inputTokens": 482000, "outputTokens": 127000 },
#     "claude-haiku-4-5-20251001": { "usd": 0.89, "inputTokens": 1240000, "outputTokens": 147000 },
#     "tts-1": { "usd": 0.22, "characters": 14700 }
#   }
# }

# Historical spend (last 30 days)
curl "$JARVIS_API_URL/api/monitoring/cost/history?days=30" \
  -H "Authorization: Bearer $AUTH_SECRET" | jq .

# Budget alert status
curl "$JARVIS_API_URL/api/monitoring/cost/alerts" \
  -H "Authorization: Bearer $AUTH_SECRET" | jq .

Internal Dashboard — Cost Panel

The Cost Governor panel in the Internal Dashboard shows:

Today’s spend meter — Circular gauge showing current daily spend vs budget. Color: green (< 50%), yellow (50–80%), orange (80–95%), red (> 95%).

Spend by model — Bar chart showing token counts and USD cost per model for the current period.

Spend trend (30 days) — Line chart showing daily spend over the last 30 days with budget line overlay.

Budget utilization — Table showing each category, current spend, budget, and utilization percentage.

Cost anomaly detector — Flags any day where spend is > 2x the 7-day rolling average. This catches runaway loops, infinite retries, or unexpectedly expensive prompts.
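The anomaly rule can be sketched in a few lines. This is an illustration under stated assumptions rather than the detector's actual code: `findAnomalousDays` and `DailySpend` are hypothetical names, the input is assumed to be in chronological order with one entry per day, and the rolling average is assumed to cover the 7 days before the day being checked, excluding that day. Days without a full 7-day history are skipped.

```typescript
interface DailySpend {
  date: string; // e.g. "2026-04-12"
  usd: number;
}

function findAnomalousDays(
  days: DailySpend[],
  windowSize: number = 7,
  multiplier: number = 2,
): DailySpend[] {
  const flagged: DailySpend[] = [];
  for (let i = windowSize; i < days.length; i++) {
    // Average of the windowSize days immediately before day i.
    const previous = days.slice(i - windowSize, i);
    const average = previous.reduce((sum, d) => sum + d.usd, 0) / windowSize;
    // Skip all-zero windows; any spend at all would otherwise be flagged.
    if (average > 0 && days[i].usd > multiplier * average) {
      flagged.push(days[i]);
    }
  }
  return flagged;
}
```

Excluding the current day from the average is a deliberate choice in this sketch: including it would let a large spike raise its own baseline and make it harder to flag.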

Cost Optimization Insights

The Cost Governor also surfaces optimization opportunities:

Haiku vs Sonnet routing opportunities

Calls that used Sonnet but could have used Haiku (short, non-reasoning tasks):

SELECT
  description,
  model,
  input_tokens,
  output_tokens,
  total_cost_usd,
  -- Rates are USD per 1M tokens: Sonnet 3.00 / 15.00, Haiku 0.80 / 4.00
  (input_tokens * (3.00 - 0.80) + output_tokens * (15.00 - 4.00)) / 1000000.0 AS potential_savings_usd
FROM token_usage
WHERE model = 'claude-sonnet-4-20250514'
  AND output_tokens < 100  -- Short outputs might not need Sonnet
  AND created_at > NOW() - INTERVAL '7 days'
ORDER BY total_cost_usd DESC
LIMIT 20;

Council deliberation cost breakdown

How much does each council deliberation cost relative to the quality of the build event it produces?

SELECT
  be.description,
  SUM(tu.total_cost_usd) AS deliberation_cost,
  be.quality,
  be.quality / NULLIF(SUM(tu.total_cost_usd), 0) AS quality_per_dollar
FROM build_events be
JOIN token_usage tu ON tu.session_id = be.metadata->>'sessionId'
WHERE be.type = 'deliberation'
  AND be.created_at > NOW() - INTERVAL '30 days'
GROUP BY be.id, be.description, be.quality
-- Zero-cost sessions yield NULL quality_per_dollar; sort them last
ORDER BY quality_per_dollar DESC NULLS LAST;

Database Schema

CREATE TABLE token_usage (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES profiles(id),
  org_id UUID REFERENCES orgs(id),
  session_id TEXT,
  model TEXT NOT NULL,
  provider TEXT NOT NULL,
  input_tokens BIGINT DEFAULT 0,
  output_tokens BIGINT DEFAULT 0,
  total_tokens BIGINT GENERATED ALWAYS AS (input_tokens + output_tokens) STORED,
  total_cost_usd NUMERIC(10,6),
  description TEXT,
  metadata JSONB DEFAULT '{}',
  created_at TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX idx_token_usage_user ON token_usage(user_id, created_at DESC);
CREATE INDEX idx_token_usage_model ON token_usage(model, created_at DESC);
CREATE INDEX idx_token_usage_org ON token_usage(org_id, created_at DESC);