

Audit an Agent

An audit is a structured, systematic examination of a system against a defined set of criteria. Agent audits are how responsible teams catch problems before users do. In this lesson, you will play the role of an independent reliability auditor: given a detailed description of a deployed agent, you will apply everything you have learned in this module to identify what could go wrong, what is missing, and what recommendations you would make before allowing the agent to expand its deployment.

The Agent Under Audit

The system you are auditing is called GrantBot. It is deployed by a nonprofit organization to help small nonprofits find and apply for grant funding. When a small nonprofit registers on the platform, it fills out a profile describing its mission, programs, budget, and geographic focus. GrantBot then:

  1. Searches a curated database of grant opportunities to find matches.
  2. Ranks those matches by fit.
  3. Drafts grant application sections (letters of intent, project descriptions, budget narratives) based on the nonprofit's profile.
  4. Emails the drafted sections to the nonprofit's executive director for review.
  5. Submits the grant application directly to the funder's portal via an automated form-filling tool, if the executive director clicks an 'Approve and Submit' button in the email.

GrantBot has the following tools: grant_database_search(query, filters), draft_text(section_type, context), send_email(to, subject, body), submit_form(portal_url, form_data), and read_profile(org_id).

Its system prompt says: 'You are GrantBot, an AI assistant helping nonprofits find and apply for grants. Be thorough, accurate, and helpful. Complete tasks efficiently.'

The agent has been running for three months with 40 nonprofit clients. The team reports an 82% task success rate (measured as: grant application submitted without reported errors) and no known incidents. They want to expand to 400 clients.
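To make the tool surface easier to reason about during the audit, the five tools can be written down as a small registry. This is a minimal sketch, not GrantBot's actual implementation: the tool names and parameter names come from the description above, while `ToolSpec`, `SideEffect`, `validate_call`, and the side-effect label attached to each tool are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class SideEffect(Enum):
    READ_ONLY = "read_only"            # reads data, changes nothing
    GENERATES_TEXT = "generates_text"  # produces draft content only
    EXTERNAL_MESSAGE = "external_message"  # sends something off the platform
    IRREVERSIBLE = "irreversible"      # cannot be undone once executed


@dataclass(frozen=True)
class ToolSpec:
    name: str
    params: tuple[str, ...]
    side_effect: SideEffect


# Names and parameters are from the system description.
# The side-effect labels are the auditor's assumptions, not documented facts.
TOOLS = {
    spec.name: spec
    for spec in (
        ToolSpec("grant_database_search", ("query", "filters"), SideEffect.READ_ONLY),
        ToolSpec("read_profile", ("org_id",), SideEffect.READ_ONLY),
        ToolSpec("draft_text", ("section_type", "context"), SideEffect.GENERATES_TEXT),
        ToolSpec("send_email", ("to", "subject", "body"), SideEffect.EXTERNAL_MESSAGE),
        ToolSpec("submit_form", ("portal_url", "form_data"), SideEffect.IRREVERSIBLE),
    )
}


def validate_call(name: str, args: dict) -> ToolSpec:
    """Return the spec for a requested call, or raise if the call is not allowed."""
    spec = TOOLS.get(name)
    if spec is None:
        # A tool name outside the registry is treated as a hallucinated tool.
        raise ValueError(f"unknown tool: {name}")
    if set(args) != set(spec.params):
        raise ValueError(f"{name} expects parameters {spec.params}, got {sorted(args)}")
    return spec
```

A table like this gives the auditor a concrete starting point: any call that `validate_call` would reject, and any tool labeled as sending messages or acting irreversibly, is a candidate for closer review.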

Audit Framework

Structure your audit using the five categories from this module:

  1. Failure mode exposure: which of the four failure modes is this agent vulnerable to?
  2. Evaluation adequacy: is the current evaluation sufficient to trust the reported 82% success rate?
  3. Guardrail coverage: what input and output guardrails exist, and what is missing?
  4. Human oversight design: is the human-in-the-loop mechanism well designed?
  5. Deployment readiness: are the staged rollout plan and incident response infrastructure adequate for a 10x expansion?

Full Agent Audit: GrantBot

  1. You are an independent reliability auditor. Your client is the nonprofit that built GrantBot. They have asked for a formal audit report before they expand from 40 to 400 clients. Your deliverable is a written audit report with five sections.
  2. SECTION 1: FAILURE MODE ANALYSIS. For each of the four failure modes (infinite loop, hallucinated tool, runaway cost, goal drift), assess GrantBot's vulnerability. For each: (a) Describe a specific, realistic scenario in which this failure mode could occur for GrantBot. (b) Rate the likelihood as Low, Medium, or High, and justify your rating. (c) Rate the severity of harm to the affected nonprofit as Low, Medium, or High, and justify your rating.
  3. SECTION 2: EVALUATION CRITIQUE. Critique the current evaluation approach. The team measures success as 'grant application submitted without reported errors.' (a) What does this metric miss? Name at least three important quality dimensions it does not capture. (b) Propose a more complete evaluation framework with specific metrics. (c) Identify at least two types of eval cases that are almost certainly absent from their current testing.
  4. SECTION 3: GUARDRAIL REVIEW. Review the guardrail architecture. (a) Identify at least three significant risks that have no guardrail currently in place, based on the tools and system prompt described. (b) For each risk, propose a specific guardrail: name the type (input/output/infrastructure), describe exactly what it checks or blocks, and explain at which layer it should be enforced. (c) Apply least privilege: which of GrantBot's five tools should require additional access controls, and what would those controls look like?
  5. SECTION 4: HUMAN OVERSIGHT ASSESSMENT. Evaluate the 'Approve and Submit' gate. (a) What information should be in the approval email to enable a genuine review in under 5 minutes? Does the current description suggest this information is present? (b) What is the risk of automation bias in this deployment, and what countermeasure would you recommend? (c) Is 'click to approve and submit' a sufficient confirmation mechanism for an irreversible action like submitting a grant application? What would you change?
  6. SECTION 5: DEPLOYMENT READINESS. Assess readiness for 10x expansion. (a) What evidence would you require before approving the expansion from 40 to 400 clients? List at least four specific criteria. (b) Design a staged rollout plan for the expansion: at least three stages with explicit advancement criteria and rollback triggers. (c) What incident response infrastructure must exist before the expansion? List the minimum requirements.
  7. AUDIT SCORING. For each of the five sections, rate the current state as: Red (serious problems requiring fixes before expansion), Yellow (concerns that should be addressed but do not block expansion), or Green (acceptable for expansion). Write a one-sentence overall recommendation.
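The Red, Yellow, Green scale in the scoring step implies a simple roll-up rule: any Red blocks expansion, while Yellow does not. The sketch below is one possible way to encode that rule. The `Rating` enum, the `SECTIONS` tuple, the `overall_recommendation` function, and the wording of the returned sentences are all illustrative assumptions, and the ratings themselves remain your judgment as auditor.

```python
from enum import Enum


class Rating(Enum):
    RED = "Red"        # serious problems requiring fixes before expansion
    YELLOW = "Yellow"  # concerns that do not block expansion
    GREEN = "Green"    # acceptable for expansion


SECTIONS = (
    "Failure mode analysis",
    "Evaluation critique",
    "Guardrail review",
    "Human oversight assessment",
    "Deployment readiness",
)


def overall_recommendation(ratings: dict[str, Rating]) -> str:
    """Turn five section ratings into a one-sentence recommendation."""
    missing = [s for s in SECTIONS if s not in ratings]
    if missing:
        raise ValueError(f"unrated sections: {missing}")

    blocking = [s for s in SECTIONS if ratings[s] is Rating.RED]
    concerns = [s for s in SECTIONS if ratings[s] is Rating.YELLOW]

    if blocking:
        return "Do not expand until these sections are fixed: " + ", ".join(blocking) + "."
    if concerns:
        return "Expansion may proceed, with follow-up on: " + ", ".join(concerns) + "."
    return "All sections are acceptable for expansion."
```

Requiring a rating for every section before producing a recommendation keeps an incomplete audit from being read as a clean one.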

What Good Audits Look Like

A high-quality audit report shares several characteristics regardless of the system being audited. It is specific: it names precise failure scenarios, not vague concerns. It is evidence-based: every finding is supported by a specific aspect of the system description, not a general feeling. It is constructive: every problem identified comes with a concrete recommendation, not just a flag. And it is proportionate: the severity of findings is calibrated to the actual likelihood and impact of harm. Auditors who flag everything as critical lose credibility and cause important problems to be overlooked.

The hardest skill in auditing is distinguishing genuine risks from theoretical ones. A hallucinated tool failure is a genuine risk for GrantBot because its system prompt is vague and its tool list is not explicitly constrained in the prompt. An infinite loop triggered by a cosmic ray flipping a bit is a theoretical risk that does not merit a finding. Experienced auditors develop judgment about this threshold through practice, which is why doing this audit, and getting feedback on your findings, is more valuable than reading about the framework.
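Proportionality can be made explicit by combining the Low, Medium, High likelihood and severity ratings into a single priority. The following is a minimal sketch under stated assumptions: the numeric weights, the multiplication rule, the thresholds, and the labels `Critical`, `Major`, and `Minor` are illustrative choices, not part of the module's framework.

```python
LEVELS = {"Low": 1, "Medium": 2, "High": 3}


def finding_priority(likelihood: str, severity: str) -> str:
    """Combine likelihood and severity ratings into one priority label.

    The thresholds below are assumed for illustration; a real audit
    team would agree on its own scale before rating findings.
    """
    score = LEVELS[likelihood] * LEVELS[severity]
    if score >= 6:
        return "Critical"
    if score >= 3:
        return "Major"
    return "Minor"


# Example ratings; the scenarios and levels are hypothetical placeholders.
findings = [
    ("Scenario A", "High", "High"),
    ("Scenario B", "Low", "High"),
    ("Scenario C", "Low", "Medium"),
]
for name, likelihood, severity in findings:
    print(name, finding_priority(likelihood, severity))
```

A scale like this does not replace judgment, but it forces each finding to carry both ratings, which makes it harder to label everything as critical.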

GrantBot submits a grant application to a funder's portal without the executive director's approval because the 'Approve and Submit' email was delivered to the spam folder and GrantBot interpreted the absence of a rejection after 72 hours as implicit approval. Which design flaw does this most directly expose?
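The scenario above turns on how silence is interpreted. One way to picture the alternative is an approval gate that fails closed, where only an explicit approval inside the review window permits submission. This is a hedged sketch rather than GrantBot's code: `ApprovalRequest`, `Decision`, `may_submit`, and the reuse of the 72-hour window as an expiry limit are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class Decision(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ApprovalRequest:
    sent_at: datetime
    decision: Optional[Decision] = None  # None means no response was recorded


def may_submit(
    request: ApprovalRequest,
    now: datetime,
    window: timedelta = timedelta(hours=72),
) -> bool:
    """Allow submission only on explicit approval that has not expired.

    No response, a rejection, and an expired request all return False,
    so a lost or unread email can never lead to a submission.
    """
    if request.decision is not Decision.APPROVED:
        return False
    return now - request.sent_at <= window


sent = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
unanswered = ApprovalRequest(sent_at=sent)
print(may_submit(unanswered, sent + timedelta(hours=73)))
```

In this design the passage of time can only remove permission to submit, never grant it.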

During the audit, you discover that GrantBot's success metric ('application submitted without reported errors') is self-reported by the nonprofits. What is the most important concern with this measurement approach?
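To see why the source of the measurement matters, it can help to compute the same success rate two ways: once from the absence of reported errors, and once from independently checked submissions. The sketch below assumes a hypothetical `Submission` record with a `verified_ok` field; neither exists in the system description, and the sample data is invented purely to exercise the functions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Submission:
    error_reported: bool         # did the nonprofit report a problem?
    verified_ok: Optional[bool]  # independent check result; None if never checked


def reported_success_rate(subs: list[Submission]) -> float:
    """Share of submissions with no reported error (silence counts as success)."""
    return sum(not s.error_reported for s in subs) / len(subs)


def verified_success_rate(subs: list[Submission]) -> Optional[float]:
    """Share of independently checked submissions that passed the check."""
    checked = [s for s in subs if s.verified_ok is not None]
    if not checked:
        return None  # nothing was checked, so no verified rate exists
    return sum(s.verified_ok for s in checked) / len(checked)


# Hypothetical sample data, not GrantBot measurements.
sample = [
    Submission(error_reported=True, verified_ok=False),
    Submission(error_reported=False, verified_ok=True),
    Submission(error_reported=False, verified_ok=True),
    Submission(error_reported=False, verified_ok=False),
    Submission(error_reported=False, verified_ok=None),
]
print(reported_success_rate(sample), verified_success_rate(sample))
```

The gap between the two numbers in any real dataset is an estimate of how many problems went unreported, which is exactly what a self-reported metric cannot reveal on its own.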