Observe · Platform

Real-Time Tracing

Every model call, tool call, and custom span — as it happens, not reconstructed afterward.

Prefactor captures every agent run as structured spans in real time — model calls, tool calls, and the custom spans you define for your own domain — and shows you the conversation as it unfolds, not a log you piece together after the fact.

Live trace: support-agent-v4 (illustrative)
model.chat       gpt-4o        142ms
tool.search_kb   custom span    89ms
tool.refund_api  custom span   412ms

TL;DR

Every agent run becomes an instance made of spans — one per LLM call, tool call, or custom operation you define — captured in real time with inputs, outputs, duration, and outcome. Spans are the raw material every other Prefactor capability reads, from cost to drift.

What gets captured, in real time

Every model call and tool call an agent makes becomes a span the moment it happens — not batched, not sampled after the fact. On top of that, you can define custom spans for whatever matters in your own domain: a retrieval step, a validation check, a specific tool your agent calls that's worth tracking on its own. Custom spans are what let quality assessment measure drift on the parts of a run that are actually specific to your product, not just generic model-call metrics.
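
As a rough illustration of the span shape described above, the sketch below models a span as a plain Python record with inputs, outputs, duration, and outcome, and wraps a custom operation in a context manager. The `Span` and `Run` classes and the `span()` helper are hypothetical names chosen for this example. They are not the Prefactor SDK, whose actual API is documented at docs.prefactor.ai/sdks.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Span:
    """Hypothetical span record: one step inside an agent run."""
    span_type: str                     # e.g. "model.chat" or "tool.search_kb"
    inputs: Any
    outputs: Any = None
    duration_ms: float = 0.0
    succeeded: Optional[bool] = None   # None while the step is still running


@dataclass
class Run:
    """Hypothetical container for the spans of a single agent run."""
    agent: str
    spans: List[Span] = field(default_factory=list)

    @contextmanager
    def span(self, span_type: str, inputs: Any):
        record = Span(span_type=span_type, inputs=inputs)
        self.spans.append(record)      # recorded as soon as the step starts
        start = time.perf_counter()
        try:
            yield record
            record.succeeded = True
        except Exception:
            record.succeeded = False
            raise
        finally:
            record.duration_ms = (time.perf_counter() - start) * 1000.0


# A custom span around a domain-specific retrieval step.
run = Run(agent="support-agent-v4")
with run.span("tool.search_kb", inputs={"query": "refund policy"}) as s:
    s.outputs = {"hits": 3}
```

Appending the record before the step finishes is what would let a live view show a span while it is still in flight. The outcome and duration are filled in when the block exits, whether it succeeds or raises.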

Getting spans flowing takes a CLI install or an SDK integration — the TypeScript or Python core SDK, or native support for the frameworks Prefactor integrates with — and most teams are sending real-time data within the same session.

Watching a run as it happens

The conversational view shows a run unfolding live: what the agent said, which tools it called and with what arguments, and what came back. You watch it the way you'd watch a conversation, not the way you'd reconstruct one from a log file. Alongside it, the 95th and 99th percentiles (p95 and p99) are tracked per agent and per span type, for latency and also for cost, risk, and quality metrics. A slow tool call, a costlier-than-usual run, or a quietly worsening tail of low-quality responses all show up as a number, not a vague sense that things feel off today.
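
To make the p95 and p99 figures concrete, here is a minimal sketch of how tail percentiles could be computed per agent and per span type from raw span durations. It uses the nearest-rank definition of a percentile, which is an assumption for this example. The page does not say which percentile method Prefactor uses, and the function names and sample data are made up.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def tail_latency(spans):
    """Group durations by (agent, span type) and report p95 and p99 for each group."""
    groups = defaultdict(list)
    for agent, span_type, duration_ms in spans:
        groups[(agent, span_type)].append(duration_ms)
    return {
        key: {"p95": percentile(durations, 95), "p99": percentile(durations, 99)}
        for key, durations in groups.items()
    }


# Made-up sample: 100 spans with durations of 1 ms through 100 ms.
sample = [("support-agent-v4", "tool.refund_api", float(ms)) for ms in range(1, 101)]
stats = tail_latency(sample)
print(stats[("support-agent-v4", "tool.refund_api")])  # {'p95': 95.0, 'p99': 99.0}
```

The same grouping works for any numeric span attribute, so cost or a quality score could be substituted for `duration_ms` without changing the structure.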

This is the real-time half of the platform: Observe isn't a dashboard you check at the end of the day; it's a view of what's happening in production right now.

Comparing versions and environments

Every span carries the agent's version and environment, taken from the registry. That's what makes comparison possible: the same task's spans from last week's version against this week's, or staging against production, line up against each other directly. Drift shows up as a difference between those spans, such as a tool call that used to take 200ms now taking 900ms, or a custom span's output shape changing after a prompt update. You see it there before it shows up as a falling quality score or an incoming support ticket.

Quality assessment (Evaluate) turns that comparison into a drift trend, and every span remains in the audit trail afterward for search and export.
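
The version-to-version comparison described above can be sketched in a few lines. This is an illustration under stated assumptions, not Prefactor's drift algorithm. The dictionary keys, the `latency_drift` function, the median statistic, and the 1.5x threshold are all hypothetical choices for the example.

```python
from collections import defaultdict
from statistics import median


def median_by_type(spans, version):
    """Median duration per span type for one agent version."""
    durations = defaultdict(list)
    for span in spans:
        if span["version"] == version:
            durations[span["span_type"]].append(span["duration_ms"])
    return {span_type: median(values) for span_type, values in durations.items()}


def latency_drift(spans, baseline, candidate, ratio=1.5):
    """Span types whose median duration grew by more than `ratio` between two versions."""
    before = median_by_type(spans, baseline)
    after = median_by_type(spans, candidate)
    return {
        span_type: (before[span_type], after[span_type])
        for span_type in before.keys() & after.keys()
        if before[span_type] > 0 and after[span_type] / before[span_type] > ratio
    }


# Made-up spans: the refund tool slows from roughly 200 ms to roughly 900 ms.
spans = [
    {"version": "v3", "span_type": "tool.refund_api", "duration_ms": 200},
    {"version": "v3", "span_type": "tool.refund_api", "duration_ms": 210},
    {"version": "v3", "span_type": "model.chat", "duration_ms": 140},
    {"version": "v4", "span_type": "tool.refund_api", "duration_ms": 900},
    {"version": "v4", "span_type": "tool.refund_api", "duration_ms": 880},
    {"version": "v4", "span_type": "model.chat", "duration_ms": 150},
]
drifted = latency_drift(spans, baseline="v3", candidate="v4")
print(drifted)  # {'tool.refund_api': (205.0, 890.0)}
```

Filtering on an `environment` key instead of `version` would give the staging-against-production comparison in the same way.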

Frequently asked questions

What is a span?
A span is the atomic record of a single step inside an agent run — one discrete unit of work, such as an LLM call, a tool invocation, or a message. Each span records its input, output, duration, and whether it succeeded or failed.
What's the difference between a span and a custom span?
Span types are defined by an agent's activity schema, which the team building the agent controls. A custom span is one you define for a domain-specific operation — a retrieval step, a validation check — rather than using a generic catch-all type, so it can be validated and analysed on its own.
How do I start sending traces to Prefactor?
Install the SDK for your framework — the TypeScript or Python core SDK, or a native integration package like @prefactor/langchain or prefactor-langchain — initialise it with an API token, and spans start recording. See docs.prefactor.ai/sdks for the exact install commands.

Drop this into what you already run

TypeScript and Python SDKs, plus OpenTelemetry ingest — native for LangChain, Claude, Vercel AI, OpenClaw and LiveKit, with 15 framework integrations covered out of the box.

terminal
$ prefactor init

See it on your own agents

Book a demo and we'll walk through real-time tracing on a fleet like yours — real frameworks, real traces.

[Illustration: Agent Performance Platform dashboard, a unified performance platform for agents, authentication, and risk management, with a Mission Control panel showing live agent health and a 7-day activity heartbeat.]

See how every agent performs — and make it better

Prefactor helps teams observe, evaluate, and improve their AI agents in production — across every framework and provider.