Support agents that know when to stop guessing
A conversation that stalls on a screenshot or an ambiguous step calls for a real-time judgement, not a static FAQ answer.
Built on LangGraph — a real design-partner deployment, anonymised.
- Real-time conversation monitoring, not after-the-fact log review
- Human-in-the-loop handoff built into the span structure
- One conversation, one inspectable instance
When a support agent hits the limits of what it can reliably interpret — a screenshot, an ambiguous step, a user who's clearly stuck — Prefactor watches the conversation in real time and can trigger a human handoff instead of letting the agent guess. Every conversation is its own instance; custom spans tag exactly where it stalled, PII gets tagged and redacted as it's encountered, and a single risky run can be paused or killed without touching any other conversation.
The problem
A support agent walks users through a long setup flow: connecting an account, verifying a step, completing a multi-stage process. Some users get stuck partway through and send a screenshot instead of describing the problem in words. The agent can attempt to interpret it, but interpretation isn't certainty, and guessing wrong at a high-stakes step is worse than asking a human. There's a quality dimension underneath this too. The same question can get a different answer every time, and fixing one thing tends to quietly break another. That is why a one-off benchmark score doesn't hold up, and why the real ask is closer to a SWE-bench-style pass/fail suite, plus an LLM judge to score the qualitative aspects a binary check can't.
How it works in Prefactor
Real-time conversation monitoring reads the conversation as it unfolds, watching for the pattern that signals a user is stuck rather than just asking a normal question — not logging what happened after the fact.
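As a rough illustration of what a stuck-pattern check could look like, the sketch below counts a few simple signals over the most recent user messages. The `Message` type, the phrase list, the signals, and the threshold are all hypothetical and chosen for illustration; this is not Prefactor's detection logic.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str                     # "user" or "agent"
    text: str
    has_attachment: bool = False  # e.g. a screenshot


# Hypothetical phrases hinting that the user is stuck, not asking a new question.
STUCK_PHRASES = ("still not working", "doesn't work", "same error", "i'm stuck")


def stuck_score(messages: list[Message], step_visits: int) -> int:
    """Count simple signals that the user is stuck on the current step."""
    recent = [m for m in messages if m.role == "user"][-2:]
    score = 0
    # Signal 1: the user sent a screenshot instead of describing the problem.
    if any(m.has_attachment for m in recent):
        score += 1
    # Signal 2: a recent user message contains a frustration phrase.
    if any(p in m.text.lower() for m in recent for p in STUCK_PHRASES):
        score += 1
    # Signal 3: the flow has returned to the same step three or more times.
    if step_visits >= 3:
        score += 1
    return score


def should_escalate(score: int, threshold: int = 2) -> bool:
    """Escalate once enough signals agree (threshold is an assumption)."""
    return score >= threshold
```

Requiring two signals rather than one is a design choice in this sketch: a single screenshot is often a normal question, while a screenshot plus a frustration phrase is a stronger hint that the user is stuck.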
Human-in-the-loop is a first-class handoff, built directly into the span structure, not a support ticket thrown over a wall. When the stuck pattern is detected, Prefactor triggers the handoff as a designed part of the flow.
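One way to picture a handoff that is part of the flow rather than a side channel is a routing step that picks the next node from the conversation state. The sketch below uses plain Python; the state fields, node names, and span record shape are hypothetical. In a LangGraph graph, a function like `route` could plausibly serve as the routing function passed to `add_conditional_edges`, though how Prefactor wires this in is not described here.

```python
from typing import TypedDict


class ConversationState(TypedDict):
    conversation_id: str
    current_step: str
    stuck: bool        # set by an upstream stuck-pattern check
    spans: list[dict]  # hypothetical span records attached to this conversation


def route(state: ConversationState) -> str:
    """Return the name of the next node: hand off when stuck, otherwise continue."""
    return "human_handoff" if state["stuck"] else "agent_reply"


def human_handoff(state: ConversationState) -> ConversationState:
    """Record the handoff as a span on the conversation and return the new state."""
    span = {
        "name": "human_handoff",
        "step": state["current_step"],
        "status": "awaiting_human",
    }
    # Build a new list so the caller's state is not mutated.
    return {**state, "spans": state["spans"] + [span]}
```

Because the handoff is recorded as a span on the same state the agent uses, the reason for the escalation travels with the conversation instead of living in a separate ticket.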
Every support conversation is its own inspectable instance — open exactly one interaction and see everything that happened in it, rather than digging through a shared log.
Custom spans capture qualitative detail: which step of the flow the user got stuck on, whether the handoff was timely — tagged as structured data on the conversation, not buried in a transcript.
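To make "structured data, not a transcript" concrete, here is a minimal sketch of what such a span record might hold. The `StuckSpan` class, its field names, and the 60-second timeliness limit are assumptions for illustration, not Prefactor's span schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StuckSpan:
    conversation_id: str
    stuck_step: str            # which step of the flow the user stalled on
    seconds_to_handoff: float  # time from stall detection to human pickup

    def handoff_timely(self, limit_seconds: float = 60.0) -> bool:
        """Treat a handoff as timely if it landed within the (assumed) limit."""
        return self.seconds_to_handoff <= limit_seconds

    def to_record(self) -> dict:
        """Flatten the span into a queryable record."""
        record = asdict(self)
        record["handoff_timely"] = self.handoff_timely()
        return record


span = StuckSpan("conv-123", "verify_account", 42.0)
print(json.dumps(span.to_record()))
```

With fields like these, a question such as "which step produces the most late handoffs" becomes a filter over records rather than a read through transcripts.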
Quality gets scored on the same conversation, not just risk: an LLM-as-judge pass checks whether the handoff actually resolved the issue, and a thumbs up/down from the human who took over attaches directly to that instance — so "did this actually work" has an answer, not just "was this risky."
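The sketch below shows one way the two quality signals could sit on a single record. Everything here is hypothetical: the `QualityRecord` fields, the rule for combining the signals, and `stub_judge`, which is a keyword check standing in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class QualityRecord:
    conversation_id: str
    judge_resolved: Optional[bool] = None   # LLM-as-judge verdict, if scored
    human_thumbs_up: Optional[bool] = None  # feedback from the human who took over

    def verdict(self) -> str:
        """Combine both signals; disagreement is flagged rather than hidden."""
        signals = [s for s in (self.judge_resolved, self.human_thumbs_up) if s is not None]
        if not signals:
            return "unscored"
        if len(set(signals)) > 1:
            return "needs_review"
        return "resolved" if signals[0] else "unresolved"


def score_conversation(
    conversation_id: str, transcript: str, judge: Callable[[str], bool]
) -> QualityRecord:
    """Run the judge over a transcript and attach the result to the conversation."""
    return QualityRecord(conversation_id, judge_resolved=judge(transcript))


def stub_judge(transcript: str) -> bool:
    # Stand-in for an LLM call that decides whether the issue was resolved.
    return "that fixed it" in transcript.lower()
```

Flagging disagreement between the judge and the human as `needs_review` is a choice made for this sketch; a team could equally let the human signal win outright.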
Sensitive data in a screenshot or message gets tagged the moment it's encountered, and can be redacted automatically.
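A minimal sketch of tag-then-redact on message text follows. The two regular expressions are deliberately simple and will miss many real-world formats, and the function name and placeholder format are assumptions. Screenshots would need a text-extraction step first, which is outside this sketch.

```python
import re

# Illustrative patterns only; production PII detection needs far broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def tag_and_redact(text: str) -> tuple[str, list[str]]:
    """Return the redacted text and the list of PII tags that were found."""
    tags = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            tags.append(label)
            text = pattern.sub(f"[REDACTED_{label.upper()}]", text)
    return text, tags


redacted, tags = tag_and_redact("Reach me at jane@example.com or +1 415 555 0100.")
print(redacted)
print(tags)
```

Returning the tags alongside the redacted text keeps a record that sensitive data was present without keeping the data itself.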
If something goes wrong, that specific run can be paused — a checkpoint, not a termination — or killed outright, triggered natively from the dashboard or programmatically via a custom span, without touching any other conversation the agent is handling.
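The difference between pausing and killing, and the isolation between runs, can be sketched with a small in-memory registry. The `RunRegistry` class and its methods are hypothetical and exist only to illustrate the semantics; they are not Prefactor's API.

```python
from enum import Enum


class RunStatus(Enum):
    RUNNING = "running"
    PAUSED = "paused"
    KILLED = "killed"


class RunRegistry:
    """Tracks each conversation run separately so one can be stopped alone."""

    def __init__(self) -> None:
        self._status: dict[str, RunStatus] = {}
        self._checkpoints: dict[str, dict] = {}

    def start(self, run_id: str) -> None:
        self._status[run_id] = RunStatus.RUNNING

    def pause(self, run_id: str, state: dict) -> None:
        # Pausing keeps a checkpoint so the run can pick up where it left off.
        self._checkpoints[run_id] = dict(state)
        self._status[run_id] = RunStatus.PAUSED

    def resume(self, run_id: str) -> dict:
        if self._status[run_id] is not RunStatus.PAUSED:
            raise ValueError(f"run {run_id} is not paused")
        self._status[run_id] = RunStatus.RUNNING
        return self._checkpoints.pop(run_id)

    def kill(self, run_id: str) -> None:
        # Killing discards any checkpoint; the run cannot be resumed.
        self._checkpoints.pop(run_id, None)
        self._status[run_id] = RunStatus.KILLED

    def status(self, run_id: str) -> RunStatus:
        return self._status[run_id]
```

Keying every operation on a single `run_id` is what keeps the blast radius to one conversation: nothing in `pause` or `kill` touches the state of any other run.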
Frequently asked questions
Does Prefactor replace our support agent's logic?
What counts as a "stuck" pattern worth escalating?
Can we see why a handoff happened after the fact?
Related glossary terms
See it on your own agents
Book a demo and we'll walk through this use case, support agents that know when to stop guessing, on a fleet like yours: real frameworks, real traces.