Agent Handoffs
In a multi-agent system, agents rarely operate in complete isolation. At some point, one agent's work becomes another agent's starting point. This transition — when control and context move from one agent to another — is called a handoff. Handoffs seem simple in theory but are one of the most error-prone moments in any orchestration system. A handoff that loses context produces a downstream agent that starts from an incomplete picture. A handoff that transfers too much irrelevant information overloads the receiving agent's context window. Getting handoffs right is a core engineering discipline in multi-agent design.
What a Handoff Transfers
A handoff is not just passing a file or a text string. It is a carefully structured transfer of three distinct things: the goal (what the receiving agent must accomplish), the state (what has been learned or produced so far that the receiving agent needs to know), and the constraints (time limits, formatting requirements, scope boundaries, or other rules the receiving agent must respect).
Consider a customer service pipeline. Agent A is a triage agent that reads an incoming support ticket, classifies the issue type, and extracts key facts. When it hands off to Agent B, a specialist resolution agent, the handoff must include: the classification result, the extracted facts, the customer's account history if relevant, and any constraints the resolution agent must respect (e.g., only offer refunds up to a certain value). If the handoff omits the extracted facts, Agent B must re-read the raw ticket, wasting context and time, and risking different conclusions. If it omits constraints, Agent B might offer a resolution that violates policy.
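The goal, state, and constraints split can be made concrete with a small data structure. The following is a minimal sketch, not a prescribed format: the `Handoff` class, the `build_triage_handoff` helper, and field names such as `extracted_facts` and `max_refund` are hypothetical and chosen only to mirror the customer service example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass(frozen=True)
class Handoff:
    """Illustrative container for the three parts of a handoff."""
    goal: str                  # what the receiving agent must accomplish
    state: dict[str, Any]      # what has been learned or produced so far
    constraints: dict[str, Any] = field(default_factory=dict)  # rules to respect

    def to_json(self) -> str:
        """Serialize the handoff for transport or logging."""
        return json.dumps(asdict(self), sort_keys=True)


def build_triage_handoff(classification: str, facts: dict[str, Any],
                         max_refund: float) -> Handoff:
    """Hypothetical helper: package the triage agent's results for Agent B."""
    return Handoff(
        goal="Resolve the customer's issue within policy.",
        state={"classification": classification, "extracted_facts": facts},
        constraints={"max_refund": max_refund},
    )


handoff = build_triage_handoff(
    "billing_error", {"order_id": "A-1001", "charged_twice": True}, 50.0
)
print(handoff.to_json())
```

Keeping the three parts as separate fields makes an omission visible: a handoff with an empty `constraints` dictionary is easy to spot in a log, whereas a missing sentence in a free-text briefing is not.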
Think of a handoff as a contract: the sending agent promises to deliver a specific payload in a specific format, and the receiving agent is designed to consume exactly that payload. Contracts must be documented and tested like any API.
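One way to test a handoff contract like an API is to check the sending agent's output against the fields the receiving agent expects. The sketch below assumes a hypothetical contract (`CONTRACT`) and a stand-in `triage_stub` function in place of a real model call; the test functions follow the pytest naming convention but are plain functions that can be called directly.

```python
# Hypothetical contract: field name -> accepted Python type(s).
CONTRACT = {
    "classification": str,
    "extracted_facts": dict,
    "max_refund": (int, float),
}


def triage_stub(ticket_text: str) -> dict:
    """Stand-in for the sending agent; a real agent would call a model here."""
    return {
        "classification": "billing_error",
        "extracted_facts": {"ticket_excerpt": ticket_text[:80]},
        "max_refund": 50.0,
    }


def contract_violations(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means compliant."""
    problems = []
    for name, expected in CONTRACT.items():
        if name not in payload:
            problems.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected):
            problems.append(f"wrong type for field: {name}")
    return problems


def test_triage_output_honours_contract() -> None:
    payload = triage_stub("I was charged twice for order A-1001.")
    assert contract_violations(payload) == []


def test_missing_field_is_reported() -> None:
    payload = triage_stub("I was charged twice for order A-1001.")
    del payload["max_refund"]
    assert contract_violations(payload) == ["missing field: max_refund"]
```

Running tests like these whenever either agent's prompt or code changes catches a broken contract before it reaches production traffic.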
Handoff Formats: Structured vs. Natural Language
Handoffs can use natural language summaries or structured data formats such as JSON, YAML, XML, or domain-specific schemas. Each has tradeoffs.
Natural language summaries are easy for the receiving agent to incorporate into its reasoning context; they read like briefings from a colleague. But they can be ambiguous, are hard to parse programmatically, and can silently omit critical information if the sending agent summarizes poorly.
Structured formats are precise, machine-readable, and easy to validate against a schema. A receiving agent (or the orchestration layer itself) can verify that required fields are present before the agent starts work. Structured formats also enable logging and auditing: you can inspect every handoff message in a system trace and know exactly what information each agent received. The tradeoff is that LLM-based agents must be explicitly instructed to produce and consume structured formats, which increases system prompt overhead.
A common pattern in production systems is a structured handoff with a natural language 'summary' field: the orchestration layer parses the structured fields for routing and validation, and the receiving agent reads the summary field for readable context.
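A structured handoff with a 'summary' field might be handled as in the sketch below. It is an illustration under assumptions: the envelope field names (`route_to`, `goal`, `state`, `constraints`, `summary`) and the functions `validate_envelope` and `build_prompt` are hypothetical, and the required-field check stands in for full schema validation.

```python
import json

# Hypothetical set of fields every handoff envelope must carry.
REQUIRED_FIELDS = ("route_to", "goal", "state", "constraints", "summary")


def validate_envelope(raw: str) -> dict:
    """Orchestration-layer gate: parse the payload and reject it if incomplete."""
    envelope = json.loads(raw)
    if not isinstance(envelope, dict):
        raise ValueError("handoff rejected: payload is not a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in envelope]
    if missing:
        raise ValueError(f"handoff rejected: missing fields {missing}")
    return envelope


def build_prompt(envelope: dict) -> str:
    """Turn a validated envelope into readable context for the receiving agent."""
    constraint_lines = "\n".join(
        f"- {key}: {value}" for key, value in sorted(envelope["constraints"].items())
    )
    return (
        f"Goal: {envelope['goal']}\n\n"
        f"Briefing: {envelope['summary']}\n\n"
        f"Constraints:\n{constraint_lines}"
    )


raw_handoff = json.dumps({
    "route_to": "resolution",
    "goal": "Resolve the duplicate charge within refund policy.",
    "state": {"classification": "billing_error", "order_id": "A-1001"},
    "constraints": {"max_refund": 50.0},
    "summary": "Customer was charged twice for order A-1001 and wants a refund.",
})

envelope = validate_envelope(raw_handoff)
print(envelope["route_to"])    # structured field, used for routing
print(build_prompt(envelope))  # summary field, used as agent context
```

The validation step runs before any agent is invoked, so an incomplete handoff is rejected at the boundary instead of being discovered partway through the receiving agent's work.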
Failure Modes in Handoffs
The most common handoff failure is context loss: information that the sending agent has but considers obvious is omitted from the handoff, and the receiving agent proceeds on an incomplete picture. This is analogous to a surgeon handing off to a recovery nurse without communicating which medications the patient received during the procedure.
The second failure mode is context contamination: the receiving agent is given too much irrelevant history, and its reasoning drifts toward that history rather than the current goal. This is especially pernicious in LLM-based agents, where long context windows can cause the model to anchor on early information even when later information should take precedence.
The third failure mode is format mismatch: the sending agent produces output in a format that the receiving agent or the orchestration layer cannot parse. This causes the downstream agent to fail immediately, often in ways that are difficult to debug if the orchestration system does not log the raw handoff payload.
A fourth failure mode is goal drift: the handoff correctly transfers state but describes the goal ambiguously, and the receiving agent pursues a subtly different goal than intended. The mitigation is an explicit, unambiguous goal statement, preferably phrased as acceptance criteria rather than open-ended instructions.
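Three of these failure modes have simple mechanical guards, sketched below under assumptions: the per-agent allow-lists in `NEEDED_FIELDS` and the functions `parse_payload` and `trim_for` are hypothetical. Logging the raw payload addresses format mismatch, the allow-list limits context contamination, and the missing-field check surfaces context loss. Goal drift is not covered here, because it depends on how the goal is worded rather than on payload structure.

```python
import json
import logging

logger = logging.getLogger("handoff")

# Hypothetical allow-lists: the fields each receiving agent actually needs.
NEEDED_FIELDS = {
    "resolution": {"classification", "extracted_facts", "max_refund"},
}


def parse_payload(raw: str) -> dict:
    """Parse a handoff, logging the raw text if it cannot be understood."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        # Keep the unparsed payload so a format mismatch can be debugged later.
        logger.error("unparseable handoff payload: %r", raw)
        raise
    if not isinstance(payload, dict):
        logger.error("handoff payload is not an object: %r", raw)
        raise TypeError("handoff payload must be a JSON object")
    return payload


def trim_for(agent: str, payload: dict) -> dict:
    """Pass on only the fields the agent needs; fail loudly if any are absent."""
    needed = NEEDED_FIELDS[agent]
    missing = needed - set(payload)
    if missing:
        raise ValueError(f"context loss: {sorted(missing)} missing for {agent}")
    return {name: payload[name] for name in sorted(needed)}


raw = json.dumps({
    "classification": "billing_error",
    "extracted_facts": {"order_id": "A-1001"},
    "max_refund": 50.0,
    "full_history": "a long transcript the resolution agent does not need",
})
trimmed = trim_for("resolution", parse_payload(raw))
print(sorted(trimmed))
```

In this sketch the irrelevant `full_history` field never reaches the resolution agent, and a payload without `max_refund` stops the pipeline instead of letting the agent proceed without its constraint.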
A handoff that fails loudly (raises an exception, fails schema validation) is easy to fix. A handoff that silently loses one key fact and allows the downstream agent to proceed — confidently, producing plausible-looking but wrong output — can propagate errors through an entire pipeline before anyone notices.
Flashcards
An orchestration pipeline passes a 12,000-word customer interaction history to every downstream agent, regardless of what each agent needs. Which handoff failure mode is this an example of?
A sending agent summarizes a user's complaint as 'The user is unhappy about their order.' The receiving resolution agent interprets this as a shipping delay and offers a tracking update. The actual issue was a billing error. Which failure mode caused this?
Design a Handoff Contract
- You are building an automated medical appointment scheduling system with two agents: a Triage Agent (classifies urgency and specialty needed from a patient's description of symptoms) and a Scheduling Agent (books the appointment in the calendar system).
- Step 1: List every piece of information the Scheduling Agent needs to do its job correctly.
- Step 2: Design the handoff payload as a JSON schema — name every field, give it a type, and mark it as required or optional.
- Step 3: Add a 'constraints' section to the schema that conveys any rules the Scheduling Agent must respect (e.g., urgency-based time limits).
- Step 4: Write a natural language 'summary' field that would give the Scheduling Agent a readable briefing.
- Step 5: Identify two facts the Triage Agent might consider obvious and accidentally omit. What would go wrong downstream?
- Present your schema to a peer and have them try to 'break' it by identifying a scenario your schema does not handle.