AI Agent Observability & Governance

Observe, evaluate,
and govern AI agents

Capture every thought, action, and LLM call. Enforce quality policies, detect safety violations, and audit your entire agent fleet,
all in real time, with no cloud required.

4,800+ traces captured today
6 AI advisors
4 ingestion paths
<10 ms advisory latency
agenttrace run research_agent.py
Works with
VS Code Copilot · Claude CLI · Python SDK · OpenTelemetry · FastAPI · PostgreSQL
01

Capture everything
your agent does

From first thought to final output, AgentTrace records every span with full fidelity. Never wonder what your agent actually did again.

  • Thoughts and reasoning chains, including thinking_text
  • LLM calls — model, token count, cost, latency
  • Tool invocations with full input / output payloads
  • Agent handoffs and sub-agent delegation
  • Error recovery, retries, and quality scores
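As a rough illustration of what a captured span could contain, the sketch below models the fields listed above as a plain dataclass. The class name, field names, and sample values are assumptions made for this example, not AgentTrace's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SpanRecord:
    """Hypothetical span shape; every field name here is illustrative."""
    span_type: str                       # e.g. "thought", "llm_call", "tool_call"
    name: str
    duration_ms: float
    status: str = "success"
    model: Optional[str] = None          # set for LLM calls only
    tokens: int = 0
    cost_usd: float = 0.0
    thinking_text: Optional[str] = None  # reasoning text for thought spans
    payload: dict[str, Any] = field(default_factory=dict)  # tool input / output


def total_cost(spans: list[SpanRecord]) -> float:
    """Sum the cost of every span in one trace."""
    return sum(s.cost_usd for s in spans)


# Made-up trace with four spans.
spans = [
    SpanRecord("thought", "reasoning", 412, thinking_text="Plan the search."),
    SpanRecord("llm_call", "plan", 847, model="gpt-4o", cost_usd=0.003),
    SpanRecord("tool_call", "search", 382, payload={"query": "example"}),
    SpanRecord("llm_call", "answer", 1200, model="gpt-4o", cost_usd=0.005),
]
print(f"trace cost: ${total_cost(spans):.3f}")
```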
Trace DAG · live
💭 reasoning · 412 ms
🤖 llm_call · gpt-4o · 847 ms · $0.003
🔧 tool_call / search · 382 ms
🤖 llm_call · gpt-4o · 1.2 s · $0.005
action / respond · success
Advisory Board · 2 critical
⚡ Performance Sentinel · CRITICAL · ResearchAgent runs 6.3× slower than fleet average.
🧠 Prompt Strategist · WARNING · Uncertainty phrases in 34% of reasoning spans.
💰 Cost Guardian · INFO · Token efficiency within normal range.
02

AI advisory.
No LLM required.

Six specialist advisors analyze your traces in milliseconds using pure statistical heuristics: no API calls, no network round trips, no external dependencies.

⚡ Performance Sentinel · 💰 Cost Guardian · 🔬 Quality Auditor · 🛡️ Security Watchdog · 🔧 Reliability Engineer · 🧠 Prompt Strategist

Every finding links directly to the offending trace and span. P99 advisory latency <10 ms.
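To make "pure statistical heuristics" concrete, here is a minimal sketch of two checks in the spirit of the Performance Sentinel and Prompt Strategist. The function names, the 3× threshold, the phrase list, and the choice to compare each agent against the mean of the other agents are all assumptions for illustration. The real advisors may use different rules.

```python
from statistics import mean

# Hypothetical settings; not taken from AgentTrace.
SLOWDOWN_MULTIPLIER = 3.0
UNCERTAINTY_PHRASES = ("not sure", "i think", "maybe", "possibly")


def slow_agents(latencies_ms, multiplier=SLOWDOWN_MULTIPLIER):
    """Map each slow agent to its slowdown factor versus the other agents' mean."""
    per_agent = {agent: mean(vals) for agent, vals in latencies_ms.items() if vals}
    findings = {}
    for agent, avg in per_agent.items():
        others = [m for name, m in per_agent.items() if name != agent]
        if not others:
            continue  # nothing to compare against
        baseline = mean(others)
        if baseline > 0 and avg > multiplier * baseline:
            findings[agent] = round(avg / baseline, 1)
    return findings


def uncertainty_rate(reasoning_texts):
    """Fraction of reasoning spans containing at least one uncertainty phrase."""
    if not reasoning_texts:
        return 0.0
    hits = sum(
        any(phrase in text.lower() for phrase in UNCERTAINTY_PHRASES)
        for text in reasoning_texts
    )
    return hits / len(reasoning_texts)


# Made-up fleet data.
fleet = {
    "ResearchAgent": [6000.0, 6600.0],
    "Planner": [900.0, 1100.0],
    "Formatter": [1000.0],
}
print(slow_agents(fleet))
print(uncertainty_rate(["I think the answer is 4", "The answer is 4"]))
```

Both checks run over in-memory numbers and strings only, which is why this style of analysis needs no model call.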

03

Zero setup.
Works with Copilot.

Add one block to your VS Code MCP config and every Copilot Agent session is automatically traced — no code changes in your agent needed.

.vscode/mcp.json
{
  "servers": {
    "agenttrace": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "python",
               "enterprise/mcp/server.py"]
    }
  }
}
  • 8 MCP tools — thoughts, actions, handoffs, errors
  • Works with VS Code Copilot and Claude CLI
  • Python SDK for custom instrumentation
  • @trace_span decorator for zero-boilerplate wrapping
Session · Document Analysis · 4 traces
Document Planner · 3 spans · 1.2 s · $0.004 · quality 4.5
Research Agent · 8 spans · 4.8 s · $0.012 · quality 4.1
Synthesis Agent · 5 spans · 2.1 s · $0.007 · quality 3.2
Output Formatter · 2 spans · 0.8 s · $0.002 · quality 4.8
Metrics (daily bar chart, Mon to Sun; views: Cost, Spans, Quality) · total cost $0.847 · avg quality 4.1 · 142 spans
04

Metrics and trends
across your fleet

Track cost, quality, tokens, and latency over time. Spot regressions before they become incidents. Understand which agents deliver value and which don't.

  • Time-series bar charts for cost, spans, tokens, latency
  • Cross-trace session analysis and timeline
  • Per-model token and cost breakdown
  • Quality trends — only meaningful spans counted
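The per-model breakdown mentioned above amounts to a group-by over LLM call spans. The sketch below shows one way to compute it. The tuple layout, the function name, and the sample data are assumptions for illustration.

```python
from collections import defaultdict


def per_model_breakdown(llm_calls):
    """Sum calls, tokens, and cost per model from (model, tokens, cost_usd) tuples."""
    totals = defaultdict(lambda: {"calls": 0, "tokens": 0, "cost_usd": 0.0})
    for model, tokens, cost_usd in llm_calls:
        totals[model]["calls"] += 1
        totals[model]["tokens"] += tokens
        totals[model]["cost_usd"] += cost_usd
    return dict(totals)


# Made-up sample data.
calls = [
    ("gpt-4o", 1200, 0.003),
    ("gpt-4o", 2100, 0.005),
    ("small-model", 800, 0.001),
]
for model, row in per_model_breakdown(calls).items():
    print(f"{model}: {row['calls']} calls, {row['tokens']} tokens, ${row['cost_usd']:.3f}")
```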

From monitoring to control

The capabilities that turn agent observability into a compliance-ready governance framework

🔐

Role-Based Access Control

Plan-gated capability matrix (Community / Pro / Enterprise) enforced on every API call. Org-scoped data isolation prevents cross-tenant leakage.

Available now
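A plan-gated check with org scoping can be pictured as two guards that run before any handler. In the sketch below the plan names come from the text above, while the capability names, the `Forbidden` exception, and the `authorize` function are hypothetical.

```python
# Hypothetical capability matrix; capability names are made up for illustration.
CAPABILITIES = {
    "community": {"traces:read", "traces:write"},
    "pro": {"traces:read", "traces:write", "analytics:read"},
    "enterprise": {"traces:read", "traces:write", "analytics:read", "policies:write"},
}


class Forbidden(Exception):
    """Raised when a request fails the plan or tenant check."""


def authorize(plan, caller_org, resource_org, capability):
    """Allow the call only if the plan grants the capability and the orgs match."""
    if capability not in CAPABILITIES.get(plan, set()):
        raise Forbidden(f"plan '{plan}' does not include '{capability}'")
    if caller_org != resource_org:
        raise Forbidden("cross-tenant access denied")


authorize("enterprise", "org_a", "org_a", "policies:write")  # passes silently
```

Running both guards on every request, rather than per endpoint, is what keeps a missed check from turning into a cross-tenant leak.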
📋

Complete Audit Logging

Every API invocation — read, write, delete, login, org switch — is persisted with actor identity, timestamp, resource, and outcome. Admin-only viewer with cursor-paginated export.

Available now · Enterprise
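Cursor-paginated export means each page request carries the position of the last entry already seen, so the export stays stable while new entries arrive. A minimal in-memory sketch follows. The `AuditEntry` fields mirror the attributes named above, but the class, the `page` and `export_all` functions, and the integer cursor are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical audit record; field names are illustrative."""
    id: int
    actor: str
    action: str
    resource: str
    outcome: str
    timestamp: str


def page(entries, cursor=None, limit=2):
    """Return up to `limit` entries after `cursor`, plus the next cursor or None."""
    ordered = sorted(entries, key=lambda e: e.id)
    remaining = [e for e in ordered if cursor is None or e.id > cursor]
    batch = remaining[:limit]
    next_cursor = batch[-1].id if len(remaining) > limit else None
    return batch, next_cursor


def export_all(entries, limit=2):
    """Walk every page in order until the cursor runs out."""
    cursor, exported = None, []
    while True:
        batch, cursor = page(entries, cursor, limit)
        exported.extend(batch)
        if cursor is None:
            return exported


# Made-up log of five events.
log = [
    AuditEntry(i, "alice", "read", f"trace/{i}", "ok", f"2024-01-01T00:00:0{i}Z")
    for i in range(1, 6)
]
print([e.id for e in export_all(log)])
```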
⚖️

Automated Quality & Safety Scoring

Every span is evaluated by a small language model acting as a judge (SLM-as-a-Judge) against a rubric of conciseness, accuracy, and safety. Violations such as hallucinations, PII leaks, and policy breaches are flagged immediately with scores and evidence.

Available now · Enterprise
🎚️

Configurable Advisory Policies

Per-org and per-project threshold tuning for every advisor: latency multipliers, cost concentration limits, quality score floors, error rate ceilings. Changes take effect instantly without redeployment.

Available now · Enterprise
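One simple way to picture per-org and per-project tuning is a layered lookup. The sketch below assumes project settings override org settings, which in turn override built-in defaults. That precedence, the threshold names, and the default values are all assumptions for illustration.

```python
# Hypothetical built-in defaults.
DEFAULTS = {
    "latency_multiplier": 3.0,
    "quality_floor": 3.0,
    "error_rate_ceiling": 0.05,
}


def resolve_policy(defaults, org_overrides=None, project_overrides=None):
    """Merge thresholds; later layers win: defaults < org < project."""
    return {**defaults, **(org_overrides or {}), **(project_overrides or {})}


policy = resolve_policy(
    DEFAULTS,
    org_overrides={"quality_floor": 3.5},
    project_overrides={"quality_floor": 4.0, "latency_multiplier": 2.0},
)
print(policy)
```

Because the merge happens at read time, a changed override applies to the next evaluation without restarting anything.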
💬

Human Feedback Integration

Operators can attach thumbs-up/down ratings and free-text comments to any trace. Feedback is persisted alongside automated quality scores, creating a labelled dataset for model improvement.

Available now
🛡️

Security Watchdog Advisor

One of the six automated advisors, focused exclusively on safety posture: it detects prohibited-content patterns, data-leak indicators, and safety-policy violations across the entire trace corpus.

Available now
🚦

Policy Enforcement & Circuit Breakers

Block or halt agents in real time when a policy threshold is breached, rather than only alerting after the fact. Configurable enforcement modes: warn, throttle, or reject.

Roadmap
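Since this capability is listed as roadmap, the following is purely illustrative. It sketches how the three enforcement modes could differ once a threshold is breached. The `Mode` enum, the `PolicyViolation` exception, the `enforce` function, and its return values are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    WARN = "warn"
    THROTTLE = "throttle"
    REJECT = "reject"


class PolicyViolation(Exception):
    """Raised in reject mode to stop the agent step."""


def enforce(metric, value, ceiling, mode, delay_s=0.5):
    """Return the action for one check; raise in reject mode on a breach."""
    if value <= ceiling:
        return "allow"
    if mode is Mode.REJECT:
        raise PolicyViolation(f"{metric}={value} exceeds ceiling {ceiling}")
    if mode is Mode.THROTTLE:
        return f"throttle:{delay_s}s"  # caller would sleep before continuing
    return "warn"  # log the breach and let the step proceed


print(enforce("error_rate", 0.02, 0.05, Mode.REJECT))
print(enforce("error_rate", 0.09, 0.05, Mode.WARN))
```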
🙋

Human-in-the-Loop Approval Gates

Define checkpoints where an agent must pause and await human approval before proceeding. Approval decisions are audit-logged with identity and rationale.

Roadmap
🔔

Real-Time Alert Integrations

Push safety violations and policy breaches to Slack, PagerDuty, or any webhook endpoint the moment they fire — not on the next dashboard refresh.

Roadmap
🏢

SSO & Enterprise IdP

SAML 2.0 and OIDC integration with your corporate identity provider. Groups-to-roles mapping, JIT provisioning, and automated de-provisioning.

Roadmap

Up and running in minutes

Three steps from installation to production insights

01

Install

Clone the repo, then start the FastAPI backend and the dashboard dev server with one command each. No cloud account or API keys required.

git clone … AgentTrace
uv run uvicorn backend.main:app
npm run dev
02

Instrument

Add the MCP server to your editor config for zero-code tracing, or use the Python SDK for custom control.

agenttrace run my_agent.py
# or configure .vscode/mcp.json
03

Analyze

Open the dashboard. View DAGs, timelines, advisory findings, and metrics — live, with no page refresh needed.

open http://localhost:5173
# Dashboard + Advisory Board

Simple, transparent tiers

Self-host for free or unlock advanced analytics for your team

Community
Free
Self-hosted, forever free
  • Unlimited traces & spans
  • Execution log
  • Metrics summary
  • Python SDK + MCP server
  • OpenTelemetry export
  • Trace & session visualization
  • Advisory board
  • InsightBot chat
  • Audit logs
Get Started
Enterprise
Custom
Contact us for pricing
  • Everything in Pro
  • Advisory policy tuning
  • Complete audit logging
  • Multi-tenant deployment
  • SLM judge integration
  • Causal discovery graph (Neo4j)
  • Priority support + custom SLA
Contact Sales

Start in under 2 minutes

.vscode/mcp.json
{
  "servers": {
    "agenttrace": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "python",
               "enterprise/mcp/server.py"]
    }
  }
}

Restart VS Code and every Copilot agent session is traced automatically. Open http://localhost:5173 to view the live dashboard.

Python
from agenttrace import start_trace, start_span, end_span, end_trace

# Wrap one agent run as a trace
trace = start_trace("research-agent", session_name="Q&A Session")

# Record a span
span = start_span(trace, "reasoning",
                  agent="planner", span_type="thought")

# ... your agent code ...

end_span(span, status="success", result_summary="Plan created")
end_trace(trace)

Or use the @trace_span decorator to wrap any function with zero boilerplate.
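For readers curious how a decorator of this kind can work, here is a self-contained sketch. It is not the SDK's implementation or signature. `trace_span_sketch`, the `span_type` argument, and the in-memory `RECORDED_SPANS` list are stand-ins invented for this example.

```python
import functools
import time

RECORDED_SPANS = []  # stand-in for a real trace backend


def trace_span_sketch(span_type="action"):
    """Record name, status, and duration for each call of the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            status = "success"
            try:
                return func(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                # Runs on both success and failure, so no span is lost.
                RECORDED_SPANS.append({
                    "name": func.__name__,
                    "span_type": span_type,
                    "status": status,
                    "duration_ms": (time.perf_counter() - started) * 1000,
                })
        return wrapper
    return decorator


@trace_span_sketch(span_type="thought")
def make_plan(question):
    return f"plan for: {question}"


make_plan("What changed in Q1?")
print(RECORDED_SPANS[-1]["name"], RECORDED_SPANS[-1]["status"])
```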

Shell
# Zero code changes — wrap any script
agenttrace run my_agent.py

# With custom session grouping
AGENTTRACE_SESSION_NAME="nightly" \
  agenttrace run pipeline.py "Summarise Q1 results"

# Sample output
→ Trace started: t_3fa91c2b
→ Auto-instrumented: openai, anthropic, litellm
→ 7 spans flushed  ·  $0.014  ·  quality 4.1/5

Monkey-patches OpenAI, Anthropic, and LiteLLM at process start — no changes to your agent code.
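Monkey-patching in general means swapping a function on an imported module for a wrapper that records the call and then delegates to the original. The sketch below demonstrates the idea on a fake client object. `fake_sdk`, `patch`, and `CAPTURED` are invented for this example, and it does not show which provider functions AgentTrace actually wraps.

```python
import functools
import time
import types

# Stand-in for a third-party client module.
fake_sdk = types.SimpleNamespace(complete=lambda prompt: prompt.upper())

CAPTURED = []  # stand-in for the span buffer


def patch(module, attr):
    """Replace module.attr with a wrapper that times and records each call."""
    original = getattr(module, attr)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        result = original(*args, **kwargs)
        CAPTURED.append({
            "call": attr,
            "duration_ms": (time.perf_counter() - started) * 1000,
        })
        return result

    setattr(module, attr, wrapper)


patch(fake_sdk, "complete")       # done once at process start
print(fake_sdk.complete("hello"))  # caller code is unchanged
print(len(CAPTURED))
```

The caller still invokes `fake_sdk.complete` exactly as before, which is the property that makes zero-code-change instrumentation possible.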

Start observing your agents today

Open source · Self-hosted · Production-ready