
Cost Governor

The Cost Governor tracks every token spent by the GOVERN platform — by model, by archetype, by session, and by customer org. It enforces budget thresholds and triggers alerts before spend becomes a problem.

What Gets Tracked

Platform infrastructure spend (Archetypal AI’s cost)

  • Claude Sonnet 4.6 calls from the API gateway council deliberations
  • Claude Haiku 4.5 calls from the Autonomy Kernel heartbeats
  • OpenAI TTS-1 calls for archetype voice synthesis
  • Workers AI Llama 8B/70B calls from agent deliberations
  • Supabase database read/write costs
  • Cloudflare R2 storage costs
  • Upstash Redis request counts

Customer AI spend (tracked for governance, not billed by us)

  • Monitoring events captured from customer AI systems
  • Token counts reported via the probe container or SDK
  • Cost estimates based on provider pricing tables

The Cost Governor panel in the Internal Dashboard shows platform infrastructure spend only. Customer AI spend appears in the customer-facing Reporting surface.

Cost Tracking Architecture

[API call made]
        ↓
[AI provider response includes token counts]
        ↓
[api-gateway extracts: model, input_tokens, output_tokens]
        ↓
[POST /api/monitoring/emit with cost payload]
        ↓
[Monitoring pipeline accumulates + rolls up]
        ↓
[token_usage table updated]
        ↓
[Cost Governor queries + alerts]
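The extraction step in this flow can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual code: the `ProviderResponse` interface mirrors the `model` and `usage` fields of an Anthropic Messages API response, while `CostPayload` and `buildCostPayload` are hypothetical names, and the real body accepted by `/api/monitoring/emit` may carry more fields.

```typescript
// Fields read from an Anthropic Messages API response.
// Other providers name these differently (OpenAI uses prompt_tokens / completion_tokens).
interface ProviderResponse {
  model: string;
  usage: { input_tokens: number; output_tokens: number };
}

// Hypothetical payload shape for POST /api/monitoring/emit (an assumption).
interface CostPayload {
  model: string;
  input_tokens: number;
  output_tokens: number;
}

// Pull the three values named in the flow out of a provider response,
// rejecting counts that are negative or not whole numbers.
function buildCostPayload(response: ProviderResponse): CostPayload {
  const { model, usage } = response;
  for (const count of [usage.input_tokens, usage.output_tokens]) {
    if (!Number.isInteger(count) || count < 0) {
      throw new Error(`Invalid token count in provider response: ${count}`);
    }
  }
  return {
    model,
    input_tokens: usage.input_tokens,
    output_tokens: usage.output_tokens,
  };
}
```

Validating at the extraction step keeps malformed counts out of the rollups, where they would be harder to trace back to a single call.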

Token cost table

const TOKEN_COSTS_USD_PER_1M = {
  'claude-sonnet-4-20250514': { input: 3.00, output: 15.00 },
  'claude-haiku-4-5-20251001': { input: 0.80, output: 4.00 },
  'gpt-4o': { input: 5.00, output: 15.00 },
  'gpt-4o-mini': { input: 0.15, output: 0.60 },
  'llama-3.1-8b-instruct': { input: 0.01, output: 0.01 },
  'llama-3.1-70b-instruct': { input: 0.05, output: 0.05 },
  'tts-1': { characters: 0.015 }, // per 1000 characters
};
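To show how rates like these turn into a dollar figure, here is a minimal sketch. The function names `estimateTokenCostUsd` and `estimateTtsCostUsd` are hypothetical, only two models are copied from the table to keep the example short, and TTS is handled separately because its rate is per 1,000 characters rather than per 1M tokens.

```typescript
// USD per 1M tokens, copied from the token cost table.
type TokenRate = { input: number; output: number };

const TOKEN_RATES: Record<string, TokenRate> = {
  "claude-sonnet-4-20250514": { input: 3.0, output: 15.0 },
  "claude-haiku-4-5-20251001": { input: 0.8, output: 4.0 },
};

// TTS is priced per 1,000 characters, so it stays out of the per-1M map.
const TTS_USD_PER_1K_CHARS = 0.015;

// Cost of one call: tokens times the per-1M rate, divided by one million.
function estimateTokenCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const rate = TOKEN_RATES[model];
  if (rate === undefined) {
    throw new Error(`No pricing entry for model: ${model}`);
  }
  return (inputTokens * rate.input + outputTokens * rate.output) / 1e6;
}

// Cost of a TTS request from its character count.
function estimateTtsCostUsd(characters: number): number {
  return (characters / 1000) * TTS_USD_PER_1K_CHARS;
}
```

For example, a Sonnet call with 2,000 input tokens and 80 output tokens comes to (2,000 × 3.00 + 80 × 15.00) / 1,000,000 = $0.0072 under these rates.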

Budget Thresholds

Platform infrastructure budgets

| Category | Daily budget | Monthly budget | Alert at |
| --- | --- | --- | --- |
| Council deliberations (Claude Sonnet) | $10 | $200 | 80% |
| Autonomy kernel (Claude Haiku) | $2 | $50 | 80% |
| TTS synthesis (OpenAI) | $1 | $20 | 90% |
| Workers AI (all models) | $1 | $20 | 80% |
| Total platform | $15 | $300 | 80% |

Alert levels

| Level | Condition | Action |
| --- | --- | --- |
| INFO | 50% of daily budget | Log only |
| WARNING | 80% of daily budget | Slack notification |
| CRITICAL | 95% of daily budget | PagerDuty + auto-throttle |
| HARD STOP | 100% of daily budget | Reject new requests for the period |
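The alert levels amount to a simple threshold ladder, sketched below. Two details are assumptions rather than documented behavior: thresholds are treated as inclusive (exactly 80% counts as WARNING), and a `NONE` level is introduced for spend under 50%. The function name `alertLevelFor` is hypothetical.

```typescript
type AlertLevel = "NONE" | "INFO" | "WARNING" | "CRITICAL" | "HARD_STOP";

// Map current daily spend to an alert level, checking the highest
// threshold first so each spend value resolves to exactly one level.
function alertLevelFor(spendUsd: number, dailyBudgetUsd: number): AlertLevel {
  if (dailyBudgetUsd <= 0) {
    throw new Error("dailyBudgetUsd must be positive");
  }
  const utilizationPct = (spendUsd / dailyBudgetUsd) * 100;
  if (utilizationPct >= 100) return "HARD_STOP";
  if (utilizationPct >= 95) return "CRITICAL";
  if (utilizationPct >= 80) return "WARNING";
  if (utilizationPct >= 50) return "INFO";
  return "NONE"; // assumed: no alert below 50%
}
```

With the $15 total platform daily budget, $4.23 of spend (28.2%) raises no alert, while $14.50 (about 96.7%) would be CRITICAL.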

Cost Governor API

# Current period spend
curl "$JARVIS_API_URL/api/monitoring/cost/current" \
  -H "Authorization: Bearer $AUTH_SECRET" | jq .

# Response:
# {
#   "period": "2026-04-12",
#   "totalUsd": 4.23,
#   "budgetUsd": 15.00,
#   "utilizationPct": 28.2,
#   "breakdown": {
#     "claude-sonnet-4-20250514": { "usd": 3.12, "inputTokens": 482000, "outputTokens": 127000 },
#     "claude-haiku-4-5-20251001": { "usd": 0.89, "inputTokens": 1240000, "outputTokens": 147000 },
#     "tts-1": { "usd": 0.22, "characters": 14700 }
#   }
# }

# Historical spend (last 30 days)
curl "$JARVIS_API_URL/api/monitoring/cost/history?days=30" \
  -H "Authorization: Bearer $AUTH_SECRET" | jq .

# Budget alert status
curl "$JARVIS_API_URL/api/monitoring/cost/alerts" \
  -H "Authorization: Bearer $AUTH_SECRET" | jq .

Internal Dashboard — Cost Panel

The Cost Governor panel in the Internal Dashboard shows:

Today’s spend meter — Circular gauge showing current daily spend vs budget. Color: green (< 50%), yellow (50–80%), orange (80–95%), red (> 95%).

Spend by model — Bar chart showing token counts and USD cost per model for the current period.

Spend trend (30 days) — Line chart showing daily spend over the last 30 days with budget line overlay.

Budget utilization — Table showing each category, current spend, budget, and utilization percentage.

Cost anomaly detector — Flags any day where spend is > 2x the 7-day rolling average. This catches runaway loops, infinite retries, or unexpectedly expensive prompts.
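The anomaly rule can be sketched as below. The page does not say whether the rolling average includes the day being tested, so this version assumes it covers only the preceding seven days. The function name `findAnomalousDays` and its parameters are hypothetical.

```typescript
// Return the indexes of days whose spend exceeds `multiplier` times the
// average of the preceding `windowDays` days. Days without a full
// preceding window are skipped, since there is no baseline to compare to.
function findAnomalousDays(
  dailySpendUsd: number[],
  windowDays: number = 7,
  multiplier: number = 2,
): number[] {
  const anomalies: number[] = [];
  dailySpendUsd.forEach((today, i) => {
    if (i < windowDays) return;
    const previous = dailySpendUsd.slice(i - windowDays, i);
    const average = previous.reduce((sum, v) => sum + v, 0) / windowDays;
    // A zero baseline would flag any spend at all, so it is ignored here.
    if (average > 0 && today > multiplier * average) {
      anomalies.push(i);
    }
  });
  return anomalies;
}
```

Excluding the current day from the baseline keeps a single large spike from inflating the average it is compared against.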

Cost Optimization Insights

The Cost Governor also surfaces optimization opportunities:

Haiku vs Sonnet routing opportunities

Calls that used Sonnet but could have used Haiku (short, non-reasoning tasks):

SELECT
  description,
  model,
  input_tokens,
  output_tokens,
  total_cost_usd,
  -- Rates are USD per 1M tokens: Sonnet 3.00 / 15.00, Haiku 0.80 / 4.00
  ((input_tokens * 3.00 + output_tokens * 15.00) -
   (input_tokens * 0.80 + output_tokens * 4.00)) / 1000000.0 AS potential_savings_usd
FROM token_usage
WHERE model = 'claude-sonnet-4-20250514'
  AND output_tokens < 100 -- Short outputs might not need Sonnet
  AND created_at > NOW() - INTERVAL '7 days'
ORDER BY total_cost_usd DESC
LIMIT 20;

Council deliberation cost breakdown

How much does each council deliberation cost vs the build event quality it produces?

SELECT
  be.description,
  SUM(tu.total_cost_usd) AS deliberation_cost,
  be.quality,
  be.quality / NULLIF(SUM(tu.total_cost_usd), 0) AS quality_per_dollar
FROM build_events be
JOIN token_usage tu ON tu.session_id = be.metadata->>'sessionId'
WHERE be.type = 'deliberation'
  AND be.created_at > NOW() - INTERVAL '30 days'
GROUP BY be.id, be.description, be.quality
ORDER BY quality_per_dollar DESC;

Database Schema

CREATE TABLE token_usage (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES profiles(id),
  org_id UUID REFERENCES orgs(id),
  session_id TEXT,
  model TEXT NOT NULL,
  provider TEXT NOT NULL,
  input_tokens BIGINT DEFAULT 0,
  output_tokens BIGINT DEFAULT 0,
  total_tokens BIGINT GENERATED ALWAYS AS (input_tokens + output_tokens) STORED,
  total_cost_usd NUMERIC(10,6),
  description TEXT,
  metadata JSONB DEFAULT '{}',
  created_at TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX idx_token_usage_user ON token_usage(user_id, created_at DESC);
CREATE INDEX idx_token_usage_model ON token_usage(model, created_at DESC);
CREATE INDEX idx_token_usage_org ON token_usage(org_id, created_at DESC);