Threat Modeling
Stage 3 · Auth, Identity & Security · B.U.I.L.D. letter: L
You are three hours from shipping a new file-upload feature. The demo is beautiful, the PM is thrilled, and an attacker who has never seen your codebase just spent thirty minutes writing a script that will own your server the moment it goes live. You did not think like them before you wrote the code. They are thinking like you right now.
⚠️ The vibe trap
Vibe coding is a superpower for building fast, and fast builders naturally think about the happy path — the user uploads a file, the file saves, everyone wins. Security is the unhappy path: what does the system do when someone is actively trying to break it? Bolting on security after a feature ships is like adding a lock to a door that already has a window cut in it. Threat modeling is the habit of asking "what can go wrong?" before you write a line of code, so you find the window before the attacker does.
🗺️ The Four Questions Every Threat Model Answers
Before any diagram or acronym, threat modeling is just a structured conversation around four questions. The questions come from Adam Shostack's foundational work at Microsoft and have since been used by production teams at Amazon, Google, and thousands of startups.
| # | Question | What it produces |
|---|---|---|
| 1 | What are we building? | A data-flow diagram or written description of the system |
| 2 | What can go wrong? | A list of threats, one per row in a STRIDE table |
| 3 | What will we do about it? | Mitigations, design changes, or accepted risks |
| 4 | Did we do a good job? | A review pass — ideally with a second set of eyes |
Mental model. Think of this as a pre-mortem. You are inviting the disaster into the room before the disaster is real, so you can decide how much of it you are willing to accept and under what conditions.
Why this order matters. Question 2 is useless without Question 1. You cannot enumerate threats against a system you have not described. Teams that skip straight to "what can go wrong?" produce generic advice ("don't get hacked") rather than actionable tickets. Draw the system first, even if it is rough.
Common mistake. Running the threat model once at design time and never touching it again. The model should live next to the code and get updated every time a new integration, new user role, or new data type is added. A stale threat model is worse than none — it creates false confidence.
🏗️ Trust Boundaries and Data-Flow Diagrams
A data-flow diagram (DFD) is not a fancy architecture diagram. It is a map of where data comes from, where it goes, and what walls it crosses on the way. The walls are called trust boundaries — the lines where the trust level of data changes. Data crossing a trust boundary is almost always where something interesting (and dangerous) happens.
The five elements of a DFD:
| Symbol | Name | Example |
|---|---|---|
| Rectangle | External entity | Browser, mobile app, third-party API |
| Rounded box | Process | Your API server, a Lambda function |
| Open-ended box | Data store | PostgreSQL, S3 bucket, Redis cache |
| Arrow | Data flow | HTTP request, SQL query, file write |
| Dashed line | Trust boundary | Internet edge, service perimeter, DB subnet |
Example — a minimal file-upload feature as a text DFD:
[Browser (untrusted)]
|
| HTTPS POST /upload (crosses trust boundary: internet → your server)
↓
[API Server] ← reads req.user from JWT (crosses trust boundary: token → session)
|
| write file metadata (crosses trust boundary: app → DB)
↓
[PostgreSQL]
|
| store file bytes (crosses trust boundary: app → cloud storage)
↓
[S3 Bucket]
|
| presigned GET URL → Browser (crosses trust boundary: cloud storage → internet)
↓
[Browser (untrusted)]
Every arrow that crosses a dashed line is a candidate for a threat. Count them: there are five data flows crossing boundaries in this tiny feature. That is five places an attacker can probe.
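If you would rather keep the DFD next to the code than in a drawing tool, the same diagram can be written as data. The sketch below is one possible shape, not a standard format: the field names (`from`, `to`, `data`, `boundary`) and the `boundaryCrossings` helper are hypothetical, and the rows simply restate the five arrows from the text DFD above.

```javascript
// Hypothetical data model for the file-upload DFD; field names are illustrative.
// Each entry is one arrow. `boundary` names the trust boundary it crosses,
// or is null when the flow stays inside a single trust zone.
const flows = [
  { from: 'Browser', to: 'API Server', data: 'HTTPS POST /upload', boundary: 'internet → your server' },
  { from: 'JWT', to: 'API Server', data: 'req.user', boundary: 'token → session' },
  { from: 'API Server', to: 'PostgreSQL', data: 'file metadata', boundary: 'app → DB' },
  { from: 'API Server', to: 'S3 Bucket', data: 'file bytes', boundary: 'app → cloud storage' },
  { from: 'S3 Bucket', to: 'Browser', data: 'presigned GET URL', boundary: 'cloud storage → internet' },
];

// Every flow that crosses a boundary is a candidate for a threat.
function boundaryCrossings(allFlows) {
  return allFlows.filter((flow) => flow.boundary !== null);
}

for (const flow of boundaryCrossings(flows)) {
  console.log(`${flow.from} -> ${flow.to}: ${flow.data} [${flow.boundary}]`);
}
```

Because the list is plain data, adding a new integration means adding a row, and a reviewer can see at a glance whether the new arrow was given a boundary or left as `null`.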
Why it works. Drawing the DFD forces you to name every actor, every store, and every boundary explicitly. Teams that skip this step routinely discover in post-mortems that "we never thought about X getting access to Y" — because they never drew the line between X and Y in the first place.
Common mistake. Drawing the DFD at a level of abstraction so high that trust boundaries disappear ("the app talks to the database"). Get specific enough that each arrow represents one type of interaction. If an arrow crosses a firewall, a VPC boundary, or an authentication check, the dashed line belongs there.
🧠 STRIDE: A Threat Checklist You Can Use on Any Feature
STRIDE is a mnemonic developed at Microsoft that covers the six categories of threats that show up again and again across almost every type of software system. It is not an exhaustive taxonomy; it is a checklist you run against each data flow and process in your DFD to make sure you have not missed an obvious category.
| Letter | Threat | What it means | Violated security property |
|---|---|---|---|
| S | Spoofing | Attacker pretends to be a legitimate user or service | Authentication |
| T | Tampering | Attacker modifies data in transit or at rest | Integrity |
| R | Repudiation | User denies performing an action; no proof exists | Non-repudiation |
| I | Information Disclosure | Data leaks to someone who should not see it | Confidentiality |
| D | Denial of Service | System is made unavailable to legitimate users | Availability |
| E | Elevation of Privilege | User gains capabilities beyond what they were granted | Authorization |
Mental model. Run STRIDE like a checklist, not a creativity exercise. For each process and each data flow in your DFD, ask "is there a plausible Spoofing threat here? A Tampering threat?" If yes, write it down. If no, write "N/A — here is why." The discipline of ruling things out is as valuable as finding them.
Why it works. Without a checklist, humans naturally threat-model the threats they are already afraid of (usually the one they read about last week). STRIDE forces you to cover the categories you tend to forget. Repudiation and Denial of Service in particular are chronically under-modeled by solo developers.
Common mistake. Treating STRIDE as a pass/fail gate — "we checked it, it's fine." STRIDE is an enumeration tool, not a scoring system. The goal is a list of threats, not a grade.
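Running STRIDE as a checklist can be made literal. The sketch below generates one empty cell per DFD element and category, then reports how many cells still lack either a threat or an "N/A" reason. The element names are taken from the upload DFD; the `buildChecklist` and `unresolved` helpers and the row shape are assumptions made for illustration.

```javascript
const STRIDE = [
  'Spoofing',
  'Tampering',
  'Repudiation',
  'Information Disclosure',
  'Denial of Service',
  'Elevation of Privilege',
];

// Element names borrowed from the upload DFD; substitute your own.
const elements = ['POST /upload', 'API Server', 'PostgreSQL', 'S3 Bucket', 'Presigned GET URL'];

// One blank row per (element, category) pair.
function buildChecklist(dfdElements) {
  return dfdElements.flatMap((element) =>
    STRIDE.map((category) => ({ element, category, threat: null, notApplicableReason: null }))
  );
}

// A row is resolved once it records a threat or explains why none applies.
function unresolved(checklist) {
  return checklist.filter((row) => !row.threat && !row.notApplicableReason);
}

const checklist = buildChecklist(elements);
checklist[0].threat = 'Attacker forges a JWT to upload as another user';
console.log(`${unresolved(checklist).length} of ${checklist.length} cells still need an answer`);
```

The point of the empty grid is that ruling a category out takes a deliberate action, which is the discipline the checklist is meant to enforce.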
📊 STRIDE Applied: File Upload Feature
Here is a STRIDE table for the file-upload feature sketched in the DFD above. Each row is one threat. This is the artifact you produce in Question 2 and bring into Question 3.
Feature: File Upload (POST /upload → S3)
| # | Category | Threat | Likelihood | Impact | Priority |
|---|---|---|---|---|---|
| 1 | Spoofing | Attacker forges a JWT to upload as another user | Medium | High | HIGH |
| 2 | Tampering | Attacker replaces a stored file via direct S3 URL if bucket is public-write | Low | Critical | HIGH |
| 3 | Tampering | Attacker uploads a file with a malicious filename (../../../etc/passwd) | High | High | HIGH |
| 4 | Repudiation | No audit log of who uploaded what; user denies uploading malicious content | High | Medium | MEDIUM |
| 5 | Info Disclosure | S3 presigned URL forwarded or leaked; unauthorized user downloads file | Medium | High | HIGH |
| 6 | Info Disclosure | Error response leaks internal S3 bucket name or file path | Medium | Low | LOW |
| 7 | Denial of Service | Attacker uploads 10 GB files in a loop, exhausting storage quota | High | High | HIGH |
| 8 | Denial of Service | Attacker uploads 10,000 tiny files per second, exhausting API capacity | High | Medium | MEDIUM |
| 9 | Elevation of Privilege | Uploaded SVG/HTML file served directly; executes scripts in victim's browser | Medium | Critical | HIGH |
Prioritizing with likelihood × impact. The ratings in the Priority column come from a simple mental model: if both likelihood and impact are high, the threat is HIGH priority and needs a mitigation before you ship. A critical impact can justify HIGH even when likelihood is low, as in threat 2. If likelihood and impact are both low, you may log the threat as accepted risk and revisit it in the next sprint. Write your reasoning down: "we accept this risk because X" is a valid outcome of a threat model; ignoring it silently is not.
Common mistake. Assigning every threat the same priority. If everything is HIGH, nothing is HIGH. Force yourself to put at least a few items in LOW or ACCEPTED RISK. If you genuinely believe everything is critical, you probably have not been specific enough about likelihood.
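If you want the rating to be repeatable rather than a gut call, you can write the rubric down as a function. The version below is an assumption, not the rule the table above was built from: it maps the labels to numbers, counts impact double so that unlikely-but-severe threats stay visible, and picks thresholds that agree with threats 1, 2, 4, and 6 in the table.

```javascript
// Assumed numeric scale; the lesson itself only uses the labels.
const LEVEL = { Low: 1, Medium: 2, High: 3, Critical: 4 };

// Assumed rubric: impact counts double, then the score is bucketed.
function priority(likelihood, impact) {
  if (!(likelihood in LEVEL) || !(impact in LEVEL)) {
    throw new Error(`Unknown rating: ${likelihood} / ${impact}`);
  }
  const score = LEVEL[likelihood] + 2 * LEVEL[impact];
  if (score >= 8) return 'HIGH';
  if (score >= 6) return 'MEDIUM';
  return 'LOW';
}

console.log(priority('Medium', 'High'));  // threat 1 (forged JWT)
console.log(priority('Low', 'Critical')); // threat 2 (public-write bucket)
console.log(priority('High', 'Medium'));  // threat 4 (no audit log)
console.log(priority('Medium', 'Low'));   // threat 6 (path in error message)
```

Whatever rubric you choose, the value is in applying the same one to every row, so that a HIGH on one feature means the same thing as a HIGH on the next.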
🛡️ Threats → Mitigations → Tickets
A threat model without concrete mitigations is a philosophical document. The output of Question 3 is a table that maps each threat to a specific, implementable action — something you can file as a ticket, assign, and close.
Feature: File Upload — Mitigations Table
| Threat # | Mitigation | Implementation note | Effort |
|---|---|---|---|
| 1: JWT spoofing | Verify signature on every request; reject expired tokens; pin the accepted signing algorithm; rotate signing key quarterly | jsonwebtoken.verify(token, process.env.JWT_SECRET, { algorithms: ['HS256'] }), assuming HS256 signing; the verify call is already in auth middleware, so confirm it covers the upload route | Low |
| 2: S3 public-write | Set S3 bucket ACL to private; use IAM role for server-side writes only | Block all public ACLs at the AWS account level via the S3 Block Public Access setting | Low |
| 3: Path traversal in filename | Never use the user-supplied name as a path; generate a UUID-based storage key and keep only the file extension | `` const storageKey = `uploads/${req.user.id}/${crypto.randomUUID()}${path.extname(original)}` `` | Low |
| 4: No audit log | Insert a row into upload_events (user_id, storage_key, timestamp, ip_address) on every upload | Add to the existing db.insert call; include in migration 068 | Low |
| 5: Presigned URL leakage | Set presigned URL expiry to 15 minutes; never log full URLs; check that the requesting user owns the file before issuing a URL, since anyone holding the URL can use it until it expires | s3.getSignedUrl('getObject', { Bucket: bucket, Key: storageKey, Expires: 900 }) | Low |
| 6: S3 path in error | Catch S3 errors server-side; return generic "Upload failed" to client | Wrap S3 call in try/catch; log detail server-side only | Low |
| 7: Storage exhaustion | Enforce per-user quota in DB before writing; reject files over 50 MB at the HTTP layer | if (Number(req.headers['content-length']) > 50_000_000) return res.status(413).end(); also cap the size in the upload parser, because Content-Length is client-supplied and may be absent | Medium |
| 8: Rate exhaustion | Apply rate limiter to /upload: 20 requests per user per minute | express-rate-limit keyed on req.user.id | Low |
| 9: XSS via SVG/HTML | Never serve uploaded files from your app domain; serve from a separate S3 domain with Content-Disposition: attachment; validate MIME type on server (not client) | Allowlist the types you accept instead of only blocking text/html and image/svg+xml; use the file-type library to detect binary formats from magic bytes, not the extension, and reject anything it cannot identify (it does not detect text-based formats such as SVG) | Medium |
One mitigation in code — audit log (Threat #4):
// Insert into upload_events on every successful upload (addresses Repudiation threat)
await db.query(
  `INSERT INTO upload_events (user_id, storage_key, ip_address, uploaded_at)
   VALUES ($1, $2, $3, NOW())`,
  [req.user.id, storageKey, req.ip]
);
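A second mitigation in code, this time for the path traversal threat (Threat #3). This is a minimal sketch under stated assumptions: `makeStorageKey` and the `ALLOWED_EXTENSIONS` list are hypothetical names, and `userId` is assumed to come from the verified session (`req.user.id`), never from the request body.

```javascript
const { randomUUID } = require('node:crypto');
const path = require('node:path');

// Assumed allowlist; adjust to the types your feature actually accepts.
const ALLOWED_EXTENSIONS = new Set(['.png', '.jpg', '.jpeg', '.pdf']);

// Builds the storage key from server-side values only. The user's filename
// contributes nothing except a lowercased extension that is on the allowlist.
function makeStorageKey(userId, originalName) {
  const ext = path.extname(path.basename(originalName)).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(ext)) {
    throw new Error('Unsupported file extension');
  }
  return `uploads/${userId}/${randomUUID()}${ext}`;
}

console.log(makeStorageKey('42', '../../holiday.PNG'));
```

A name like `../../../etc/passwd` has no extension, so it is rejected outright, and even an accepted name never reaches the key: only the random UUID and the checked extension do.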
Why the effort column matters. Seven of the nine mitigations are rated Low effort, and several are close to one-line fixes. Teams that skip threat modeling end up discovering these threats in production, where the "one-line fix" now requires an incident response process, customer notification, and a post-mortem. The cost of finding a threat at design time is measured in minutes. The cost of finding it in a breach notification is measured in trust.
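The size check for Threat #7 shows how small such a fix can be. The sketch below assumes an Express-style `req`/`res`; `checkDeclaredSize` and `limitUploadSize` are hypothetical names, and requiring a Content-Length header (which rejects chunked uploads) is a design choice made here for simplicity, not something the lesson prescribes.

```javascript
const MAX_UPLOAD_BYTES = 50_000_000; // the 50 MB limit from the mitigations table

// Returns an HTTP status to reject with, or null if the declared size is fine.
// Content-Length is client-supplied, so this is only an early rejection; the
// upload parser should still enforce the same cap while reading the body.
function checkDeclaredSize(contentLengthHeader) {
  if (contentLengthHeader === undefined) return 411; // Length Required
  if (!/^\d+$/.test(contentLengthHeader)) return 400; // malformed header
  if (Number(contentLengthHeader) > MAX_UPLOAD_BYTES) return 413; // too large
  return null;
}

// Express-style middleware wrapper around the check.
function limitUploadSize(req, res, next) {
  const status = checkDeclaredSize(req.headers['content-length']);
  if (status !== null) {
    return res.status(status).end();
  }
  return next();
}
```

Mounted ahead of the upload handler, for example `app.post('/upload', limitUploadSize, handler)`, this rejects oversized requests before any bytes are written to storage.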
Common mistake. Writing mitigations that are too vague to act on — "add validation" or "improve security." A good mitigation names the specific code change, the specific configuration, or the specific policy. If you cannot assign it to an engineer as a ticket, it is not specific enough.
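For contrast, here is what a ticket-sized mitigation for Threat #9 might look like: detect the real type from the file's leading bytes and reject anything not on an allowlist. This is a hand-rolled sketch to show the idea, not a replacement for a maintained detection library; the `SIGNATURES` list and `sniffAllowedType` are hypothetical, and the three formats are an assumed allowlist.

```javascript
// Assumed allowlist of binary formats, each with its well-known leading bytes.
const SIGNATURES = [
  { mime: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { mime: 'application/pdf', bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
];

// Returns the detected MIME type, or null when the content matches nothing
// on the allowlist. The client-supplied name and Content-Type are ignored.
function sniffAllowedType(buffer) {
  for (const { mime, bytes } of SIGNATURES) {
    const matches =
      buffer.length >= bytes.length && bytes.every((byte, i) => buffer[i] === byte);
    if (matches) return mime;
  }
  return null;
}

const svgPayload = Buffer.from('<svg onload="alert(1)"></svg>');
console.log(sniffAllowedType(svgPayload)); // null, so the upload is rejected
```

Because the check is an allowlist, SVG and HTML are rejected without being named: they simply fail to match any accepted signature.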
🛠️ Your mission
Pick one feature in your current project that handles user input and produces a persistent side-effect (a write to a database, a file upload, a payment, an email send — anything with consequences). Run the full threat model on it:
Step 1 — Draw the DFD. Sketch it as text, a comment block, or a markdown table. Name every external entity, every process, every data store, and every trust boundary the data crosses.
Step 2 — Fill in the STRIDE table. Go row by row. For each of the six categories, write down at least one plausible threat or explicitly note "N/A — [reason]." Assign a likelihood and impact.
Step 3 — Build the mitigations table. For every HIGH-priority threat, write a concrete mitigation with enough detail to become a ticket. For MEDIUM threats, write the mitigation and note whether you are implementing it now or deferring.
Step 4 — Add every unresolved mitigation to your Security Audit Checklist. The checklist is your running record of what you committed to and what you shipped.
The deliverable is a THREAT-MODEL.md file at the root of your feature branch containing the DFD and two markdown tables (STRIDE + mitigations). It does not need to be long. Nine rows, like the example above, is a complete, professional threat model for a single feature.
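If you keep your threats as data, the two tables can be generated rather than hand-formatted. The helper below is a small sketch with a hypothetical name (`toMarkdownTable`); the sample rows are copied from the STRIDE table earlier in this lesson.

```javascript
// Renders an array of row objects as a markdown table in the given column order.
function toMarkdownTable(columns, rows) {
  // Escape pipes so a cell's text cannot break the table layout.
  const cell = (value) => String(value ?? '').replace(/\|/g, '\\|');
  const header = `| ${columns.join(' | ')} |`;
  const divider = `|${columns.map(() => '---').join('|')}|`;
  const body = rows.map((row) => `| ${columns.map((c) => cell(row[c])).join(' | ')} |`);
  return [header, divider, ...body].join('\n');
}

const columns = ['#', 'Category', 'Threat', 'Likelihood', 'Impact', 'Priority'];
const threats = [
  { '#': 1, Category: 'Spoofing', Threat: 'Attacker forges a JWT to upload as another user', Likelihood: 'Medium', Impact: 'High', Priority: 'HIGH' },
  { '#': 6, Category: 'Info Disclosure', Threat: 'Error response leaks internal S3 bucket name or file path', Likelihood: 'Medium', Impact: 'Low', Priority: 'LOW' },
];

console.log(toMarkdownTable(columns, threats));
```

Paste the output into THREAT-MODEL.md, or write it there from a script, so the file stays in step with the list you actually review.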
✅ You're done when…
- You have a THREAT-MODEL.md (or equivalent section in your Security Audit Checklist) for at least one feature, containing a DFD, a STRIDE table, and a mitigations table
- Every HIGH-priority threat in your STRIDE table has a corresponding mitigation with a specific implementation note
- Every unresolved mitigation from your threat model appears as an open item in your Security Audit Checklist so it cannot be forgotten at ship time
- You have assigned a likelihood and impact to each threat — at least one item is rated LOW or ACCEPTED RISK with a written reason
- Your team (or a second reader) has reviewed the model — you answered Question 4 ("did we do a good job?") with someone other than yourself
➡️ Next: Running a Real Security Audit. Build It Right, Or Don't Build It At All. 🏛️