Testing Dashboard — Overview

The Testing Dashboard (Surface 01) is the internal pre-release validation environment for the GOVERN platform, deployed at jarvis-dashboard-v6.pages.dev.

This is where all GOVERN surfaces are tested before they reach customers. The promotion path is:

Testing Dashboard (jarvis-dashboard-v6.pages.dev)
  → finalized testing
    → Internal Dashboard (Surface 02)
      → finalized for customer
        → Customer SOC (Surface 04)

Every SOC-style dashboard is first built and validated here, then promoted to the Internal Dashboard after testing passes, then copied to the Customer SOC surface for customer-facing deployment.
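
The promotion path above can be read as a one-way, one-step-at-a-time state machine. The following is a minimal sketch of that idea, not the platform's actual implementation: the `Stage` names, `nextStage`, and `promote` are hypothetical identifiers chosen for illustration.

```typescript
// Hypothetical model of the promotion path; stage names mirror the surfaces above.
type Stage = "testing-dashboard" | "internal-dashboard" | "customer-soc";

const PROMOTION_ORDER: readonly Stage[] = [
  "testing-dashboard",
  "internal-dashboard",
  "customer-soc",
];

// Returns the stage that follows `current`, or null when `current` is the last stage.
function nextStage(current: Stage): Stage | null {
  const index = PROMOTION_ORDER.indexOf(current);
  const next = PROMOTION_ORDER[index + 1];
  return next ?? null;
}

// A dashboard moves forward exactly one stage, and only once its current stage is finalized.
function promote(current: Stage, finalized: boolean): Stage {
  const next = nextStage(current);
  if (next === null) {
    throw new Error(`${current} is the final stage`);
  }
  if (!finalized) {
    throw new Error(`${current} is not finalized; promotion blocked`);
  }
  return next;
}
```

Modelling the path as an ordered list keeps the "no skipping stages" rule in one place: a dashboard cannot reach the Customer SOC stage without first passing through the Internal Dashboard stage.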

Purpose

GOVERN is a governance platform — it certifies AI systems for customers. That means our own quality bar must be higher than theirs. The Testing Dashboard enforces this by requiring every component to pass a full validation stack before release.

The dashboard answers three questions:

Does it work? Unit tests, integration tests, and e2e flows confirm functional correctness.
Does it look right? Playwright visual regression confirms UI components haven’t regressed.
Is it world-class? Gate II (V(Q) >= 85%) and Gate IV (5-point polish check) confirm quality meets the Home Standard.

Scope — The 10 GOVERN Components

Each of the 10 GOVERN components is tested before release:

#	Component	Test Focus
01	Testing Dashboard (this surface)	Meta — tests the test infrastructure itself
02	Internal Dashboard	Build event wiring, pipeline integrity, cost accuracy
03	Customer Dashboard	Assessment flows, report generation, benchmark display
04	GOVERN API	Route coverage, auth, response shapes, rate limits
05	Monitoring Probe (Docker)	Container build, agent detection, telemetry emit
06	Browser Extension	Content script injection, overlay rendering, data capture
07	OS Agent (Desktop)	Process monitoring, system event capture, telemetry
08	Mobile App	iOS/Android flows, offline behavior, sync
09	Developer SDK	Integration flows, TypeScript types, example apps
10	GOVERN Docs	Link integrity, content accuracy, search index

Testing Philosophy

The Testing Dashboard follows the GOVERN convergence doctrine: no deploy without convergence, no convergence without evidence.

This means:

Tests are not optional pre-commit theater; they are the evidence that convergence has been reached
Every test failure is a blocked deploy, not a warning
Visual regression captures what metrics cannot — that the product looks right, not just that it ran
Probe tests run in Docker so that test conditions match the production deployment environment

Quality Gates

Two constitutional quality gates govern every release:

Gate II (V(Q) >= 85%): The convergence score across all test dimensions must be 85% or higher. A component that scores below 85% is not ready to ship. No exceptions.
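
The document does not define how V(Q) is computed, so the following is only a sketch under a stated assumption: V(Q) is modelled here as the unweighted mean of per-dimension pass rates. The `DimensionResult` shape, the dimension names, and both function names are hypothetical.

```typescript
// ASSUMPTION: V(Q) is modelled as the unweighted mean of per-dimension pass rates.
// The source does not give the formula; this shape is illustrative only.
interface DimensionResult {
  dimension: string; // e.g. "unit", "integration", "visual" (hypothetical names)
  passed: number;
  total: number;
}

const GATE_II_THRESHOLD = 85; // percent, taken from the Gate II definition

// Returns the score as a percentage in the range 0..100.
function convergenceScore(results: DimensionResult[]): number {
  if (results.length === 0) return 0; // no evidence means no convergence
  const rates = results.map((r) => (r.total === 0 ? 0 : r.passed / r.total));
  const mean = rates.reduce((sum, rate) => sum + rate, 0) / rates.length;
  return mean * 100;
}

// Gate II passes only when the score is at or above the threshold.
function passesGateII(results: DimensionResult[]): boolean {
  return convergenceScore(results) >= GATE_II_THRESHOLD;
}
```

Under this assumption, a component with pass rates of 95%, 80%, and 90% across three dimensions scores roughly 88.3% and clears the gate, while one with 80% in every dimension does not. A weighted mean or a minimum across dimensions would be an equally plausible reading of "convergence score".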

Gate IV (5-Point Polish Check): Before any UI surface ships, all five points must pass:

Scene moves at idle (ambient animation active)
Orb is the interface, not a textarea
Page matches the reference benchmark (archetypal-app.pages.dev)
Energy layers L1/L2/L3 visible
Text is ambient, not dominant

Any point failing means the UI is blocked until it passes.
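
The all-or-nothing rule for the polish check can be sketched as a record of five booleans. The field names below paraphrase the five points listed above and are hypothetical; how each point is actually evaluated (by a reviewer or by tooling) is not stated in this document.

```typescript
// Hypothetical keys that paraphrase the five polish points.
interface PolishCheck {
  sceneMovesAtIdle: boolean;
  orbIsInterface: boolean;
  matchesReferenceBenchmark: boolean;
  energyLayersVisible: boolean;
  textIsAmbient: boolean;
}

// Lists the points that did not pass, so a blocked UI can report why.
function failedPolishPoints(check: PolishCheck): string[] {
  const keys = Object.keys(check) as (keyof PolishCheck)[];
  return keys.filter((key) => !check[key]);
}

// Gate IV passes only when no point has failed.
function passesGateIV(check: PolishCheck): boolean {
  return failedPolishPoints(check).length === 0;
}
```

Returning the list of failed points, rather than a bare boolean, gives whoever is blocked a concrete list of what to fix before the next attempt.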

Test Execution Flow

Engineer pushes change
       ↓
Typecheck (pnpm typecheck) — must return 0 errors
       ↓
Unit tests (pnpm test) — all suites must pass
       ↓
Integration tests — API endpoints, database queries
       ↓
Playwright visual regression — screenshots compared
       ↓
API spot-checks — curl against staging endpoints
       ↓
Probe container test — Docker build + local proxy
       ↓
QA score calculated — must be >= 85%
       ↓
Polish check — 5-point UI checklist
       ↓
Gate opens → component eligible for release
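
The flow above is strictly sequential, and any failure stops it. A minimal sketch of that behaviour follows, assuming each step can be reduced to a function that returns pass or fail. `PipelineStep`, `PipelineOutcome`, and `runPipeline` are hypothetical names, and a real runner would invoke commands such as `pnpm typecheck` and `pnpm test` rather than the stubs a caller would pass in here.

```typescript
// Hypothetical step runner: each step reports pass/fail, and the first failure blocks the release.
interface PipelineStep {
  label: string;
  run: () => boolean; // true when the step passes
}

interface PipelineOutcome {
  gateOpen: boolean;
  completed: string[]; // labels of steps that passed, in order
  blockedAt: string | null; // label of the failing step, if any
}

function runPipeline(steps: PipelineStep[]): PipelineOutcome {
  const completed: string[] = [];
  for (const step of steps) {
    if (!step.run()) {
      // Later steps are never executed once a step fails.
      return { gateOpen: false, completed, blockedAt: step.label };
    }
    completed.push(step.label);
  }
  return { gateOpen: true, completed, blockedAt: null };
}
```

Stopping at the first failure matches the rule that a test failure is a blocked deploy rather than a warning, and it avoids spending time on later, slower stages when an earlier one has already failed.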

Dashboard Layout

The Testing Dashboard UI shows:

Test run status — current pass/fail for each component
Coverage heatmap — which areas have the most test coverage
Visual diff viewer — side-by-side Playwright screenshot comparison
Gate status panel — Gate II score + Gate IV checklist per component
Recent test history — last 10 runs per component with trends
Failure detail drawer — full output, stack traces, and suggested fixes
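
The "last 10 runs with trends" view could be derived from a per-component run log. The sketch below assumes the log is ordered oldest first and defines a trend by comparing the pass rate of the newer half of the window against the older half. That definition, along with `RunRecord`, `recentRuns`, `passRate`, and `runTrend`, is an assumption made for illustration; the document does not say how trends are calculated.

```typescript
// ASSUMPTION: "trend" compares the newer half of the window with the older half.
interface RunRecord {
  runId: number;
  passed: boolean;
}

const HISTORY_WINDOW = 10; // the dashboard shows the last 10 runs per component

// Keeps only the most recent runs; the input is assumed to be ordered oldest first.
function recentRuns(runs: RunRecord[]): RunRecord[] {
  return runs.slice(-HISTORY_WINDOW);
}

// Fraction of runs that passed, in the range 0..1.
function passRate(runs: RunRecord[]): number {
  if (runs.length === 0) return 0;
  return runs.filter((r) => r.passed).length / runs.length;
}

function runTrend(runs: RunRecord[]): "improving" | "declining" | "flat" {
  const recent = recentRuns(runs);
  const mid = Math.floor(recent.length / 2);
  const older = passRate(recent.slice(0, mid));
  const newer = passRate(recent.slice(mid));
  if (newer > older) return "improving";
  if (newer < older) return "declining";
  return "flat";
}
```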

Relationship to CI/CD

The Testing Dashboard is not a replacement for CI/CD — it is the quality layer that gates deployment. The flow is:

CI (GitHub Actions) runs a fast subset of tests on every PR
Testing Dashboard runs the full stack before release approval
Release is blocked until Testing Dashboard shows green across all gates

The dashboard reads test results from CI artifacts and enriches them with visual regression, probe tests, and quality scoring that CI does not produce.
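
One way to picture this enrichment is as a merge of two result sets into a single release decision. The sketch below is illustrative only: the CI artifact format is not specified in this document, so `CiArtifactResult`, `DashboardChecks`, `ReleaseStatus`, and `evaluateRelease` are hypothetical, and the 85% threshold is taken from Gate II.

```typescript
// Hypothetical shapes: the source does not specify the CI artifact format.
interface CiArtifactResult {
  component: string;
  typecheckPassed: boolean;
  unitTestsPassed: boolean;
}

// Results that the dashboard adds on top of CI.
interface DashboardChecks {
  visualRegressionPassed: boolean;
  probeTestPassed: boolean;
  qualityScore: number; // percent
}

interface ReleaseStatus {
  component: string;
  green: boolean;
  reasons: string[]; // why the release is blocked; empty when green
}

function evaluateRelease(ci: CiArtifactResult, checks: DashboardChecks): ReleaseStatus {
  const reasons: string[] = [];
  if (!ci.typecheckPassed) reasons.push("typecheck failed in CI");
  if (!ci.unitTestsPassed) reasons.push("unit tests failed in CI");
  if (!checks.visualRegressionPassed) reasons.push("visual regression failed");
  if (!checks.probeTestPassed) reasons.push("probe test failed");
  if (checks.qualityScore < 85) reasons.push("quality score below 85%");
  return { component: ci.component, green: reasons.length === 0, reasons };
}
```

Keeping the CI results and the dashboard-only checks as separate inputs reflects the division of labour described above: CI stays fast, and the slower checks are added only at release approval.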