Skip to main content
Building with AI (Vibe Coding)

⏱ About 20 min20 XP

Building Responsibly and Safely

Every piece of software you ship is a decision about what the world looks like for the people who use it. That sentence is not rhetorical. Code that handles personal data, moderates communication, surfaces recommendations, or automates decisions directly shapes people's lives. Responsible builders do not treat safety as an afterthought or a compliance checkbox — they build it into the design from the start.

Safety by Design: The Core Principle

Safety by design is the practice of building protections against harm into a system's architecture and default behavior, rather than adding them after the fact or relying on users to protect themselves. The opposite approach — build first, patch problems later — has a poor track record. When safety is retrofitted, it is typically incomplete, inconsistent, and always more expensive than it would have been if designed in from the beginning. A classic example is the automobile. Early cars had no seatbelts, no crumple zones, and no safety glass. These were added incrementally after massive loss of life. Modern cars begin design with passive safety requirements as structural constraints — the body is engineered to direct crash energy away from occupants before the seats are designed. Software can learn from this trajectory. For software, safety by design means: Minimum necessary data collection: do not collect personal data you do not need. Data you never collect cannot be breached. Default-deny access control: start with all access restricted and explicitly grant what is needed, rather than starting open and trying to close gaps. Input validation everywhere: any data entering your system from a user or external source must be validated and sanitized before use. Unvalidated input is the root cause of the largest class of web vulnerabilities, including SQL injection and cross-site scripting (XSS). Fail safely: when a component fails, it should fail in the least-harmful state. A door lock that fails open is dangerous. A door lock that fails locked is safe.

Threat Modeling: A Precise Definition

Threat modeling is a structured process of identifying what could go wrong with a system before building it — who might attack it, what assets they would target, how they could do it, and what mitigations reduce the risk. The output is not a guarantee of security but a prioritized set of design choices that address the most probable and most harmful threats.

A simple threat modeling framework — STRIDE — is used widely in professional software teams: Spoofing: can an attacker impersonate a legitimate user or system? Tampering: can an attacker modify data in transit or at rest? Repudiation: can a bad actor perform an action and then plausibly deny doing it? Information disclosure: can an attacker access data they should not see? Denial of service: can an attacker make the system unavailable? Elevation of privilege: can an attacker gain more access than they are supposed to have? For a high school project, you do not need a formal security team. But you can walk through STRIDE informally for any feature that handles sensitive data or controls important functionality. Asking 'who could misuse this?' before building it costs minutes. Repairing the damage from not asking can cost far more. AI-specific safety considerations: AI features add novel threat surfaces. A chatbot can be manipulated through prompt injection — where a user supplies text that overrides your system instructions. An AI recommendation system can be exploited through adversarial inputs designed to produce specific outputs. A content moderation model may have blind spots for new formats of harmful content. These risks require the same proactive design thinking as traditional security threats.

Prompt Challenge

Write a prompt for an AI assistant that will help you perform a basic threat model on a software feature you are designing. The prompt should guide the AI to think through who could misuse the feature, what data it touches, and what safeguards should be built in.

Your prompt should…

  • Ask the AI to identify potential misuse scenarios for a specific feature
  • Direct the AI to list what sensitive data the feature collects or processes
  • Instruct the AI to suggest concrete design safeguards for each identified threat

Harm Can Be Indirect and Delayed

The most visible safety failures are direct: a security breach exposes user data. But many harms are indirect, delayed, or systemic. Algorithmic harm: a recommendation system optimized for engagement may learn that outrage and anxiety drive clicks, and surface progressively more extreme content — not because anyone programmed it to cause harm, but because the optimization target (engagement) was misaligned with user wellbeing. Exclusion: a system that works poorly for certain groups — people with accents, names outside the training distribution, low-bandwidth internet connections — inflicts a cost that is diffuse and rarely shows up in aggregate metrics but is real and significant to those affected. Dependency harm: if your service becomes critical infrastructure and you shut it down without warning, the people who depended on it are harmed. Free services that schools or nonprofits embed into their operations, then sunset, cause real disruption. Responsible building means asking not just 'what is the worst-case direct attack?' but 'who could be harmed by normal operation of this system, and how?'

Speed Is Not a Safety Excuse

AI allows you to build faster than ever before. Speed does not reduce your obligation to think through harm. A system that causes harm deployed in three days is not better than one that causes harm deployed in three weeks. The velocity of AI-assisted development makes proactive safety thinking more important, not less.

A developer builds a social platform that optimizes for time-on-site. Six months after launch, research shows it is increasing anxiety among teen users. The developer says he never intended to cause harm. Which principle does this scenario best illustrate?

Which of the following best exemplifies 'safety by design' rather than 'safety by patch'?

STRIDE Walk-Through

  1. Step 1: Choose a feature you have built or plan to build — a login system, a comment form, a file upload, a payment flow, or any feature that interacts with users or data.
  2. Step 2: Apply each of the six STRIDE categories (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege) to your feature. For each category, write one sentence: is this threat relevant? If so, how?
  3. Step 3: For any threat you identified as relevant, propose one concrete design change that would reduce the risk.
  4. Step 4: Write a short paragraph: did the STRIDE exercise surface any threats you had not considered? What would it have cost to address those threats after launch rather than before?