AI Agents & Automation

⏱ About 20 min · 20 XP

Why Agents Fail

A traditional software program runs a fixed sequence of instructions. When it breaks, the failure is local and usually reproducible: feed it the same input twice and it crashes the same way twice. An AI agent is fundamentally different. An agent perceives its environment, decides what to do, takes actions that change that environment, then perceives again — a closed feedback loop that can run indefinitely. That loop is what makes agents powerful, and it is exactly what makes them treacherous when things go wrong.

The Reliability Gap

A reliability gap is the distance between what a system is expected to do and what it actually does under real operating conditions. For a web server, that gap is usually small and well-understood: a server either responds or it does not, and error rates are measured in fractions of a percent. For an AI agent, the reliability gap is wide and hard to measure. The agent's behavior depends on the language model's predictions, which are probabilistic; on the tools it chooses to call, which may have side effects; and on the state of an environment that is constantly changing. Any one of these dimensions can introduce failure modes that traditional software engineering has no established vocabulary for.

The Stakes Are Different

When a web form fails, the user sees an error message. When an agent fails, it may have already sent emails, deleted files, charged a credit card, or posted to a public API. Agents act in the world before anyone can verify that their reasoning was sound. This is the core reliability problem.

Consider three agent deployments and what happens when each fails. A customer service agent misunderstands a refund request and issues a refund for the wrong order; a real financial transaction is complete before a human notices. A code-writing agent misinterprets a task and deletes a production database table while trying to 'clean up old records.' A research agent, given a long-running task, calls a paid search API hundreds of times because its exit condition is never satisfied, running up a large bill before anyone intervenes. In each case, the failure is not merely an incorrect output that a human can ignore. The failure is an action taken in the world that may be irreversible.
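
The runaway research agent above suggests one possible guard: a hard cap on calls and spend that stops the loop even when the exit condition never becomes true. The following is a minimal sketch under stated assumptions, not a definitive implementation. The names CallBudget, BudgetExceeded, and run_research_loop, along with the limits and per-call cost, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when the agent tries to exceed its call or cost budget."""


@dataclass
class CallBudget:
    max_calls: int          # hard cap on the number of tool calls
    max_cost_usd: float     # hard cap on cumulative spend
    calls: int = 0
    cost_usd: float = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record one call, or raise before the call if a cap would be broken."""
        if self.calls + 1 > self.max_calls:
            raise BudgetExceeded(f"call limit of {self.max_calls} reached")
        if self.cost_usd + cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit of ${self.max_cost_usd:.2f} reached")
        self.calls += 1
        self.cost_usd += cost_usd


def run_research_loop(budget: CallBudget, cost_per_call: float, is_done) -> str:
    """Loop until is_done() is true or the budget stops the agent."""
    while not is_done():
        try:
            budget.charge(cost_per_call)
        except BudgetExceeded as exc:
            return f"stopped: {exc}"
        # A real agent would call the paid search API here (omitted in this sketch).
    return "finished"


budget = CallBudget(max_calls=20, max_cost_usd=1.00)
# is_done never returns True, mimicking an exit condition that is never satisfied.
print(run_research_loop(budget, cost_per_call=0.01, is_done=lambda: False))
```

The budget is checked before each call rather than after, so the cap is enforced before money is spent. A cap like this does not fix the faulty exit condition; it only bounds the damage.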

Why Standard Debugging Is Not Enough

Traditional software debugging relies on determinism: given identical inputs, the system produces identical outputs, and a failing test reliably reproduces the bug. Agents violate this assumption in several ways. First, the language model at the core of the agent is stochastic: the same prompt can produce different outputs on different runs, so a failure observed once may not reappear in a test run. Second, the environment the agent operates in changes: a file that existed when the agent started may be deleted by another process halfway through the task. Third, multi-step agent trajectories compound uncertainty: a small error in step 3 may not become visible until step 11, by which point many actions have already been taken. These properties mean that the usual approach of 'reproduce the bug, fix it, verify the fix' is insufficient for agents.
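
One consequence of a stochastic core is that a single passing test says little. A rough illustration, assuming a toy stand-in for an agent run that fails at random with a fixed probability: the function toy_agent_run and the 5% failure rate are hypothetical, and a real agent's failures would not be this simple. The sketch estimates the failure rate from many runs instead of relying on one.

```python
import random


def toy_agent_run(rng: random.Random, failure_rate: float = 0.05) -> bool:
    """Hypothetical stand-in for one agent run; returns True on success."""
    return rng.random() >= failure_rate


def estimate_failure_rate(trials: int, seed: int, failure_rate: float = 0.05) -> float:
    """Run the toy agent many times and report the observed failure fraction."""
    rng = random.Random(seed)
    failures = sum(not toy_agent_run(rng, failure_rate) for _ in range(trials))
    return failures / trials


# A single run usually passes even though the toy agent fails 5% of the time.
# Repeating the run many times makes the failure rate visible.
print(estimate_failure_rate(trials=2000, seed=0))
```

Fixing the seed makes this toy reproducible, which is a luxury a deployed agent rarely offers. The broader point is that agent tests tend to report rates over many trials rather than a single pass or fail.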

Compounding Errors

In a 10-step agent trajectory, if each step has a 90% chance of being correct and steps fail independently, the probability that all 10 steps are correct is only 0.9^10 ≈ 0.35, or about 35%. Error rates that look small per step compound dramatically across a long task. Reliable agents require near-perfect per-step accuracy, aggressive error detection, or frequent human checkpoints.
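
The compounding arithmetic can be checked with a few lines of code. This sketch assumes each step succeeds or fails independently with the same probability, which is a simplification; the function names are illustrative.

```python
def trajectory_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return per_step_accuracy ** steps


def required_step_accuracy(target: float, steps: int) -> float:
    """Per-step accuracy needed for the whole trajectory to reach the target."""
    return target ** (1.0 / steps)


# Compare a few per-step accuracies over the same 10-step trajectory.
for accuracy in (0.90, 0.99, 0.999):
    overall = trajectory_success(accuracy, 10)
    print(f"{accuracy:.3f} per step -> {overall:.3f} over 10 steps")

# Invert the question: what per-step accuracy gives 90% over 10 steps?
print(f"needed per step: {required_step_accuracy(0.90, 10):.4f}")
```

Running the inverse calculation shows why "near-perfect per-step accuracy" is not an exaggeration: the required per-step figure sits well above the overall target.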

Three Root Causes

Most agent failures trace back to one of three root causes. The first is model error: the language model at the agent's core misunderstands the task, hallucinates a tool that does not exist, makes a faulty inference, or produces an output that looks plausible but is factually or logically wrong. The second is environment mismatch: the agent's model of the world diverges from the actual world — a file path it expects does not exist, an API has changed its response format, or a dependency is temporarily unavailable. The third is design flaw: the agent's architecture, prompts, or tool set were designed for a slightly different task than the one it ends up performing, creating gaps where failures can occur. Distinguishing these three causes matters because each requires a different remedy.
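
Two of these causes can sometimes be caught by simple runtime checks. The sketch below is an assumption-laden illustration, not a complete classifier: TOOL_REGISTRY, the tool names, and the two check functions are hypothetical. It flags a call to an unregistered tool as model error and a response that is no longer valid JSON as environment mismatch.

```python
import json
from enum import Enum
from typing import Optional


class RootCause(Enum):
    MODEL_ERROR = "model error"
    ENVIRONMENT_MISMATCH = "environment mismatch"


# Hypothetical registry: the only tools this agent is allowed to call.
TOOL_REGISTRY = {"read_calendar", "create_event"}


def check_tool_call(tool_name: str) -> Optional[RootCause]:
    """Flag a call to a tool that is not registered (a hallucinated tool)."""
    if tool_name not in TOOL_REGISTRY:
        return RootCause.MODEL_ERROR
    return None


def check_response(raw_body: str) -> Optional[RootCause]:
    """Flag a response body that is not the JSON the agent expects."""
    try:
        json.loads(raw_body)
    except json.JSONDecodeError:
        return RootCause.ENVIRONMENT_MISMATCH
    return None


print(check_tool_call("send_invoice"))               # tool not in the registry
print(check_response("<events><event/></events>"))   # XML where JSON was expected
```

Design flaws are deliberately absent here. A mismatch between what the agent was built for and what it is asked to do is usually found by reviewing prompts and tool sets, not by a check inside the loop.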

Match each agent failure description to its root cause.

Terms

The agent calls a tool named 'search_web' that does not exist in its tool registry
The agent expects an API to return JSON but the API was updated to return XML
The agent's prompt says 'help with tasks' but the tool set only supports calendar operations
The agent incorrectly concludes a task is complete when a required step is still pending
A file the agent needs to read was deleted by another process mid-task

Definitions

Environment mismatch — changed state
Environment mismatch — changed interface
Model error — hallucinated tool
Model error — faulty inference
Design flaw — scope misalignment

An agent is given a 15-step task and each step has an estimated 95% accuracy rate. What is the approximate probability that the agent completes all 15 steps correctly?

Which property of AI agents makes traditional reproduce-and-fix debugging fundamentally insufficient?

Failure Scenario Analysis

You are a reliability engineer reviewing a proposed agent deployment. The agent is designed to help small business owners manage their inventory: it reads inventory levels from a spreadsheet, identifies items below the reorder threshold, and automatically places purchase orders with suppliers via email.

Step 1: Identify at least three specific failure scenarios for this agent. For each, write: (a) what goes wrong, (b) which root cause it falls under (model error, environment mismatch, or design flaw), and (c) whether the failure is reversible or irreversible.

Step 2: Rank your three scenarios from most serious to least serious. Justify your ranking, considering both the magnitude of harm and the difficulty of recovery.

Step 3: For the most serious scenario you identified, propose one change to the agent's design that would prevent or mitigate it. Be specific: what would you add, remove, or change?

Step 4: Share your analysis with a partner. Did they identify scenarios you missed? Do you agree on the ranking? Discuss why reliability engineering requires multiple perspectives.