Probe Testing

The GOVERN monitoring probe is a Docker container deployed alongside customer AI systems. It intercepts inference calls, captures telemetry, and emits monitoring events to the GOVERN API. Testing the probe requires a Docker environment and a local proxy setup.

What the Probe Does

The probe container runs as a sidecar or transparent proxy alongside the customer’s AI system:

  1. Intercept — All HTTP calls to AI provider APIs (OpenAI, Anthropic, Groq, etc.) pass through the probe’s proxy
  2. Capture — Request/response data is captured: model used, tokens consumed, latency, content flags
  3. Assess — Local policy rules evaluate the captured data against the customer’s governance framework
  4. Emit — Monitoring events are sent to the GOVERN API for accumulation and rollup
  5. Report — Inline response headers carry governance metadata back to the calling application

Docker Build Test

Before any probe release, verify the Docker image builds cleanly:

```sh
cd packages/govern-probe

# Build the production image
docker build -t govern-probe:test .

# Verify the image is under 500MB (our size budget)
docker images govern-probe:test --format "{{.Size}}"

# Run the container in test mode
docker run --rm \
  -e GOVERN_API_URL=http://host.docker.internal:8787 \
  -e GOVERN_API_KEY=test-key-local \
  -e PROBE_MODE=test \
  -p 8080:8080 \
  govern-probe:test
```

Expected output on startup:

```
[GOVERN Probe] v0.x.x starting
[GOVERN Probe] Proxy listening on :8080
[GOVERN Probe] API endpoint: http://host.docker.internal:8787
[GOVERN Probe] Mode: test
[GOVERN Probe] Ready
```
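In automation, the startup check can wait for the `Ready` line instead of sleeping for a fixed interval. A sketch using a generic log-polling helper (`wait_for_line` is illustrative, not part of the probe tooling):

```sh
# Poll a log file until a pattern appears, or time out.
# Usage: wait_for_line <file> <pattern> <timeout_seconds>
wait_for_line() {
  file="$1"; pattern="$2"; timeout="$3"
  i=0
  while [ "$i" -lt "$timeout" ]; do
    if grep -q "$pattern" "$file" 2>/dev/null; then
      echo "ready"
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "timeout waiting for: $pattern"
  return 1
}

# Typical usage against a running container (requires Docker):
# docker logs <container_id> > probe.log 2>&1
# wait_for_line probe.log "Ready" 3
```

A 3-second timeout doubles as a rough check on the startup budget from the performance table below.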

Local Proxy Test

With the probe container running, verify it intercepts and forwards requests correctly:

Configure your test client

```sh
# Export proxy settings for your test session
export HTTPS_PROXY=http://localhost:8080
export HTTP_PROXY=http://localhost:8080

# If testing against Anthropic
export ANTHROPIC_BASE_URL=http://localhost:8080/anthropic
```

Send a test inference call

```sh
curl -s -X POST "http://localhost:8080/anthropic/v1/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 10,
    "messages": [{ "role": "user", "content": "Say hello" }]
  }' | jq .
```

Expected behavior:

  1. The probe intercepts the request
  2. It forwards to the real Anthropic API (or a stub in test mode)
  3. The response is returned to the caller with added governance headers:
    X-GOVERN-Assessment: pass
    X-GOVERN-EventId: evt_abc123
    X-GOVERN-PolicyFlags: []
  4. A monitoring event is emitted to the GOVERN API
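The header check in step 3 can be done mechanically against the output of `curl -s -D - -o /dev/null`. A sketch (the `assert_govern_headers` helper is hypothetical, not shipped with the probe):

```sh
# Check that a captured response-header blob carries all three governance
# headers. Header names are case-insensitive per HTTP, so normalize first.
assert_govern_headers() {
  headers="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  for name in x-govern-assessment x-govern-eventid x-govern-policyflags; do
    case "$headers" in
      *"$name:"*) ;;
      *) echo "missing header: $name"; return 1 ;;
    esac
  done
  echo "all governance headers present"
}

# Typical usage, capturing headers from the proxied test call:
# assert_govern_headers "$(curl -s -D - -o /dev/null "http://localhost:8080/anthropic/v1/messages")"
```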

Verify telemetry emission

Check that the probe emitted an event to the GOVERN API:

```sh
# Query the local API for the emitted event
curl -s "http://localhost:8787/api/monitoring/recent" \
  -H "Authorization: Bearer test-secret" | jq '.data.events[0]'
```

Expected event shape:

```json
{
  "id": "evt_abc123",
  "systemId": "probe-test",
  "eventType": "inference",
  "provider": "anthropic",
  "model": "claude-haiku-4-5-20251001",
  "inputTokens": 12,
  "outputTokens": 10,
  "latencyMs": 342,
  "policyResult": "pass",
  "timestamp": "2026-04-12T..."
}
```
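The event shape can also be validated with jq rather than by eye. A sketch, with the required-field list taken from the example above (the `validate_event` helper itself is illustrative, not part of the GOVERN tooling):

```sh
# Verify an event JSON object has every field the probe is expected to emit.
# Prints "valid" when all required keys are present, otherwise lists the
# missing ones. Uses jq array subtraction: required - keys = missing.
validate_event() {
  echo "$1" | jq -r '
    ["id","systemId","eventType","provider","model",
     "inputTokens","outputTokens","latencyMs","policyResult","timestamp"]
    - keys
    | if length == 0 then "valid" else "missing: " + join(", ") end'
}

# Typical usage against the local API:
# validate_event "$(curl -s "http://localhost:8787/api/monitoring/recent" \
#   -H "Authorization: Bearer test-secret" | jq '.data.events[0]')"
```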

Telemetry Verification Checklist

For each probe release, verify these telemetry properties are correctly captured:

Accuracy checks

  • Model name matches what was requested (no normalization drift)
  • Token counts match provider response (input + output separately)
  • Latency is measured end-to-end (probe receipt → provider response)
  • Timestamp is in UTC ISO-8601 format
  • System ID comes from the GOVERN_SYSTEM_ID env var, not defaulted
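The timestamp check above can be mechanized with a regex. A sketch (the `is_utc_iso8601` helper is hypothetical; it accepts only a trailing `Z`, so a local offset such as `+02:00` fails the UTC requirement):

```sh
# Check a timestamp is UTC ISO-8601, e.g. 2026-04-12T08:30:00.123Z.
is_utc_iso8601() {
  echo "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?Z$' \
    && echo "ok" || echo "bad timestamp: $1"
}
```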

Policy evaluation checks

  • policyResult is pass, flag, or block
  • policyFlags is an array (empty array [] when no flags, not null)
  • Block decisions result in the original request being rejected (HTTP 451)
  • Flag decisions pass through with the governance header set
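The pass/flag/block semantics in this checklist can be expressed as a small table-driven expectation for test assertions. A sketch (the `expected_status_for` helper is illustrative; the block-to-451 mapping comes from the checklist above):

```sh
# Expected proxy behavior for each policy result:
#   pass  -> forward, upstream status returned unchanged
#   flag  -> forward, governance header set
#   block -> reject at the probe with HTTP 451
expected_status_for() {
  case "$1" in
    pass|flag) echo "upstream" ;;   # response passes through
    block)     echo "451" ;;        # request rejected at the probe
    *)         echo "unknown policy result: $1"; return 1 ;;
  esac
}
```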

Emission reliability checks

  • Events are emitted within 100ms of the proxied response returning
  • Failed emissions are retried (check probe logs for retry behavior)
  • After 3 failed retries, events are written to a local buffer file
  • Probe does not block the proxy response waiting for emission acknowledgment
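The retry-then-buffer behavior above can be sketched in shell for test harnesses. The real probe implements this internally; `emit_with_retry` below is only an illustrative model of the expected semantics (3 attempts, then append to a buffer file rather than drop the event):

```sh
# Try an emission command up to 3 times; on total failure, append the
# event payload to a local buffer file instead of dropping it.
# Usage: emit_with_retry <payload> <buffer_file> <emit_command...>
emit_with_retry() {
  payload="$1"; buffer="$2"; shift 2
  attempt=1
  while [ "$attempt" -le 3 ]; do
    if "$@" "$payload" 2>/dev/null; then
      echo "emitted on attempt $attempt"
      return 0
    fi
    attempt=$((attempt + 1))
  done
  printf '%s\n' "$payload" >> "$buffer"
  echo "buffered after 3 failed attempts"
}

# Typical usage (send_event is whatever emitter your harness provides):
# emit_with_retry "$event_json" probe.buffer send_event
```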

Multi-Provider Testing

The probe supports multiple AI providers. Test each before release:

```sh
# OpenAI test
curl -s -X POST "http://localhost:8080/openai/v1/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"hello"}],"max_tokens":5}'

# Groq test
curl -s -X POST "http://localhost:8080/groq/openai/v1/chat/completions" \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3-8b-8192","messages":[{"role":"user","content":"hello"}],"max_tokens":5}'
```
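Per-provider tests like these can be looped instead of copy-pasted. A sketch, with the path mapping taken from the curl commands in this page (the `probe_path_for` helper is hypothetical):

```sh
# Map a provider name to its probe proxy path, mirroring the curl tests above.
probe_path_for() {
  case "$1" in
    anthropic) echo "/anthropic/v1/messages" ;;
    openai)    echo "/openai/v1/chat/completions" ;;
    groq)      echo "/groq/openai/v1/chat/completions" ;;
    *)         echo "unsupported provider: $1"; return 1 ;;
  esac
}

# Typical loop (requires the probe on :8080; add each provider's auth header
# and a minimal JSON body as in the examples above):
# for p in anthropic openai groq; do
#   curl -s -o /dev/null -w "$p: %{http_code}\n" \
#     -X POST "http://localhost:8080$(probe_path_for "$p")"
# done
```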

Container Size and Performance Budget

| Metric | Budget | How to measure |
| --- | --- | --- |
| Image size | < 500MB | `docker images govern-probe:test --format "{{.Size}}"` |
| Startup time | < 3 seconds | `time docker run --rm govern-probe:test --dry-run` |
| Proxy overhead | < 20ms P95 | Run 100 requests, measure latency added vs. direct |
| Memory at idle | < 128MB | `docker stats govern-probe:test --no-stream` |
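The proxy-overhead budget needs a percentile, not an average. A sketch using the nearest-rank method (the `p95` helper and the curl timing loop are illustrative, not part of the probe tooling):

```sh
# Compute the 95th percentile of newline-separated millisecond latencies
# on stdin, using the nearest-rank method: value at ceil(0.95 * n) of the
# sorted sample.
p95() {
  sort -n | awk '{ v[NR] = $1 }
    END {
      idx = int(NR * 0.95)
      if (idx < NR * 0.95) idx++   # ceil
      print v[idx]
    }'
}

# Typical usage: time 100 proxied requests, then compare the result against
# the same loop pointed directly at the provider.
# for i in $(seq 1 100); do
#   curl -s -o /dev/null -w "%{time_total}\n" "http://localhost:8080/anthropic/v1/messages"
# done | awk '{ print $1 * 1000 }' | p95
```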

Probe Test Automation

```sh
# Full probe test suite (requires Docker)
cd packages/govern-probe
pnpm test:probe

# This runs:
# 1. Docker build verification
# 2. Container startup check
# 3. Proxy interception tests (10 providers)
# 4. Telemetry emission verification
# 5. Policy evaluation accuracy tests
# 6. Performance budget checks
```

Common Probe Failures

| Symptom | Cause | Fix |
| --- | --- | --- |
| Container exits immediately | Missing env vars | Check `GOVERN_API_URL` and `GOVERN_API_KEY` are set |
| Proxy returns 502 | Upstream provider unreachable | Check network, try stub mode |
| No events in GOVERN API | Emission failing silently | Check probe logs: `docker logs <container_id>` |
| Policy result always "pass" | Policy rules not loaded | Verify `GOVERN_POLICY_FILE` points to a valid rules file |
| Token counts wrong | Provider response parsing error | Check probe version matches provider API version |