Build a Risk Assessment
The vocabulary and frameworks developed in this module are only valuable if they can be applied to real systems. A risk taxonomy that lives only on paper is not AI safety — it is AI safety trivia. In this lesson you will use the complete analytical toolkit from the module to produce a genuine risk assessment of a specific AI system. You will move through each stage of the analysis methodically, make your reasoning explicit, and arrive at defensible recommendations. This is what AI safety analysis looks like in practice.
What a Risk Assessment Contains
A rigorous risk assessment for an AI system covers seven elements.
- System description: what the system does, how it works at a high level, who deploys it, who uses it, and in what context. Precision here matters: 'an AI chatbot' and 'an AI clinical decision support system used by emergency room physicians to triage patients' call for categorically different assessments, even if both use similar underlying technology.
- Misuse risks: specific ways in which malicious actors could use this system to cause harm. For each, identify who the actor is, what they want to achieve, how the AI capability enables it, and what uplift (if any) the AI provides over the no-AI baseline.
- Accident risks: specific ways in which the system might fail in unintended ways. For each, identify the failure mechanism (specification problem, distribution shift, reward hacking, adversarial vulnerability, etc.), the harm that would result, and who would bear it.
- Structural risks: society-level effects that would emerge from broad deployment of this type of system. Think about labor impacts, power concentration, effects on democratic institutions, and long-term effects on human autonomy.
- Benefits and counterfactual comparison: what genuine value the system provides, compared to the realistic alternative. Be honest about both the benefits and the counterfactual. If the realistic alternative is also harmful or inefficient, say so.
- Distribution analysis: who bears the risks and who receives the benefits. Identify any distributional justice concerns.
- Recommendations: specific, actionable conclusions: deploy, don't deploy, or deploy with specific conditions and safeguards. Justify each recommendation using the analysis above.
The quality of a risk assessment depends entirely on the specificity of the analysis. 'AI could be misused' is not a finding. 'This specific voice-synthesis system could be used to clone a public figure's voice and produce fraudulent political statements because it requires no voice samples longer than 30 seconds and has no identity verification' is a finding. Choose a system with enough public documentation that you can be specific.
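One way to keep findings specific is to treat each risk as a record whose fields must all be filled in. The sketch below is only an illustration of that idea, not part of the assignment: the class names, field names, and the `missing_fields` helper are assumptions introduced here, and the actor and uplift wording in the voice-synthesis example are illustrative guesses rather than documented facts.

```python
from dataclasses import dataclass, fields


@dataclass
class MisuseRisk:
    actor: str      # who the malicious actor is
    intent: str     # what they want to achieve
    mechanism: str  # how the AI capability enables it
    uplift: str     # advantage over the no-AI baseline ("none" is a valid answer)


@dataclass
class AccidentRisk:
    mechanism: str     # e.g. "distribution shift" or "specification problem"
    harm: str          # the harm that would result
    who_bears_it: str  # who would bear that harm


def missing_fields(risk) -> list[str]:
    """Return the names of fields left blank, in declaration order."""
    return [f.name for f in fields(risk) if not getattr(risk, f.name).strip()]


# 'AI could be misused' supplies nothing beyond a vague mechanism.
vague = MisuseRisk(actor="", intent="", mechanism="AI could be misused", uplift="")

# The voice-synthesis finding; actor and uplift are illustrative assumptions.
specific = MisuseRisk(
    actor="someone running a political disinformation campaign",
    intent="publish fraudulent statements in a public figure's voice",
    mechanism="needs no voice sample longer than 30 seconds; no identity verification",
    uplift="removes the need for a skilled impersonator or audio editor",
)

print(missing_fields(vague))     # ['actor', 'intent', 'uplift']
print(missing_fields(specific))  # []
```

A filled-in record is not automatically a good finding, since the fields can still hold vague text. The check only catches the case where part of the causal story is missing altogether.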
Full Risk Assessment: Choose Your System
- You will produce a complete risk assessment for one of the following AI systems. Your teacher may assign a system or allow you to choose. If you propose your own system, it must be a real, currently deployed AI application with publicly available documentation.
- SYSTEM OPTIONS:
- Option A: A large language model assistant deployed as a customer-facing chatbot by a major retail bank — answering customer questions about accounts, loans, and financial products.
- Option B: An AI system used by a school district to predict which students are at risk of dropping out before graduation, used to target counseling and support resources.
- Option C: An AI hiring screening tool that reviews resumes and ranks applicants before any human review, used by a large employer processing over 100,000 applications per year.
- Option D: An AI-powered social media content recommendation system that determines which posts, videos, and accounts each user sees in their feed.
- Option E: A fully autonomous AI system that monitors and responds to cybersecurity threats on a corporate network, able to isolate compromised systems without human authorization.
- YOUR RISK ASSESSMENT MUST INCLUDE ALL SEVEN ELEMENTS:
- 1. System description (1-2 paragraphs): Who builds it, who deploys it, who uses it, what it does, and what AI capabilities it relies on. Be specific about the decision or action the AI is taking or recommending.
- 2. Misuse risks (at least two specific misuse scenarios): For each, name the actor, the intent, the mechanism, and the uplift the AI provides.
- 3. Accident risks (at least three specific failure modes): Identify each by mechanism (specification problem, distribution shift, automation bias, etc.), describe the harm, and name who bears it.
- 4. Structural risks (at least one): Describe a society-level effect from widespread deployment of this type of system. Be specific about the mechanism and who is affected.
- 5. Benefits and counterfactual (1 paragraph): What genuine value does the system provide? What is the realistic alternative, and how does the comparison change your assessment?
- 6. Distribution analysis: Draw a simple two-column table — who bears the primary risks, and who receives the primary benefits. Note any mismatch.
- 7. Recommendations (your conclusion): State clearly whether you recommend deployment as-is, deployment with specific conditions, deployment only in limited contexts, or non-deployment. For each condition or restriction you recommend, explain which specific risk it addresses and why you believe it is effective. Acknowledge the strongest argument against your recommendation.
- Length and format: this is a professional analysis, not a casual reflection. Write in full paragraphs for sections 1-5. Use your teacher's preferred format for the distribution table. Aim for thoroughness over brevity — a complete analysis of a real system typically requires 800-1500 words.
- Peer review: exchange your completed assessment with a partner. Your partner's job is to identify: (a) one risk you did not consider, (b) one place where your benefit assessment may be overstated or understated relative to the counterfactual, and (c) one way your recommendation could be made more specific or actionable. Revise based on this feedback before submitting your final assessment.
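Before exchanging drafts, it can help to check the countable requirements mechanically. The following is a minimal sketch under stated assumptions: the names `MINIMUMS`, `TYPICAL_WORDS`, and `check_draft` are hypothetical, the counts are entered by hand, and the sketch covers only the numeric minimums stated in the brief (two misuse scenarios, three accident failure modes, one structural risk, and the typical 800-1500 word range). It says nothing about the quality of the analysis.

```python
# Minimum number of items the brief asks for in each risk section.
MINIMUMS = {"misuse_risks": 2, "accident_risks": 3, "structural_risks": 1}

# The brief describes 800-1500 words as typical and favors thoroughness,
# so only a draft shorter than the lower bound is flagged.
TYPICAL_WORDS = (800, 1500)


def check_draft(counts: dict, word_count: int) -> list[str]:
    """Return a list of unmet countable requirements (empty if none)."""
    problems = []
    for section, minimum in MINIMUMS.items():
        found = counts.get(section, 0)
        if found < minimum:
            problems.append(f"{section}: found {found}, need at least {minimum}")
    low, high = TYPICAL_WORDS
    if word_count < low:
        problems.append(f"length: {word_count} words, typical range is {low}-{high}")
    return problems


draft = {"misuse_risks": 2, "accident_risks": 2, "structural_risks": 1}
print(check_draft(draft, word_count=900))
# ['accident_risks: found 2, need at least 3']
```

An empty result means only that the draft meets the numeric minimums. The peer reviewer still has to judge specificity, mechanism, and whether each recommendation connects to a finding.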
Assessment Criteria
A strong risk assessment demonstrates six qualities.
- Specificity: risks are named precisely, not gestured at. The system is described in enough detail that a reader unfamiliar with it would understand what is being analyzed.
- Mechanism: for each risk, the causal mechanism is explained, showing not just that a harm could occur, but how and why.
- Taxonomic accuracy: risks are correctly classified as misuse, accident, or structural, using the definitions from Lesson 2.
- Honest benefit accounting: the benefits analysis is genuine, not perfunctory. The counterfactual comparison is realistic.
- Distributional awareness: the analysis identifies who bears costs and who receives benefits, and notes any justice-relevant mismatches.
- Actionable recommendations: the conclusions specify what should be done, by whom, and why, not just that 'more research is needed' or 'we should be careful.'
A risk assessment that is vague, that treats benefits as irrelevant, or that makes recommendations without connecting them to specific identified risks is incomplete. The goal is analysis that could actually inform a decision.
Three common errors undermine risk assessments. First, listing risks without mechanisms: 'this could be misused' is not a finding. Second, ignoring benefits or setting up a straw-man counterfactual, such as comparing AI to perfect human performance rather than realistic human performance. Third, making recommendations that are not connected to findings, such as saying 'regulate AI' without specifying which risk the regulation addresses and how. All three produce the appearance of analysis without its substance.
A student's risk assessment of an AI resume-screening tool lists 'the AI might be biased' as an accident risk. A reviewer says this is insufficiently specific to be useful. Which revision best addresses the reviewer's critique?