🧪 Testing & Craft · Lesson 1 of 13

Why Test At All

The real cost of bugs, and how tests give you the confidence to change anything.

Stage 3 · Testing, Quality & Craft · B.U.I.L.D. letter: D

You shipped it. It worked. Then you changed one line three weeks later and something completely unrelated broke in production. That's not bad luck — that's the absence of tests. Tests are the difference between "I think this still works" and "I know this still works."


⚠️ The vibe trap

Vibe coding got you here — and that's genuinely impressive. You shipped real things by describing what you wanted and iterating fast. But "I clicked around and it seemed fine" isn't testing, it's hoping. The moment your app has real users, real data, and real stakes, hope is not a strategy. Every change you make without tests is a silent gamble that nothing you already built is broken.

The deeper trap: working code that you're afraid to touch is not working code — it's a liability. If you can't refactor, extend, or fix it without dread, the code owns you. Tests are how you take that power back.


💸 The real cost of bugs (it compounds)

A bug caught while you're writing the function costs you 5 minutes. The same bug caught in a code review costs 30 minutes. In QA: 2 hours. In production: 2 days — plus your reputation, plus a midnight incident, plus explaining to users why their data is wrong.

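The times above are illustrative, and the exact multipliers vary from study to study, but the pattern is more than folklore. The figures usually quoted trace back to IBM data from the 1970s, and later studies point in the same direction: the later a bug is found, the more it costs to fix.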

Cost to fix a bug, relative to when it's caught:

  During coding         ▓  (1×)
  During code review    ▓▓▓  (6×)
  During QA             ▓▓▓▓▓▓▓▓  (15×)
  After release         ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓  (60×)

                        (Source: NIST, 2002; exact multipliers vary by study)

Mental model: think of bugs like leaks in a pipe. The further downstream water travels before you find the leak, the more damage it has done and the harder it is to trace back to the source.

Why it matters for you: you're building on top of AI-generated code. That code is usually correct, but it can be subtly wrong in ways that don't surface until a specific input hits a specific edge case. Tests confirm the parts that work and expose the parts that don't.

Common mistake: "I'll add tests later, once the feature is stable." Later never comes. Write the test the moment you write the function — your future self will thank you with actual words.


🔬 What a test actually is

A test is the simplest thing in the world: run some code, check that the result is what you expected. That's it. No special magic, no framework ceremony.
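
To make that concrete, here's a rough sketch of a test with no framework at all. The names isEven and check are hypothetical, invented for this illustration; they are not part of any library.

```javascript
// Hypothetical function under test: true when n is an even integer.
function isEven(n) {
  return Number.isInteger(n) && n % 2 === 0;
}

// Hypothetical helper: compare actual to expected and report the outcome.
function check(description, actual, expected) {
  const passed = actual === expected;
  console.log(`${passed ? "PASS" : "FAIL"}: ${description}`);
  return passed;
}

// Run some code, check the result. That's the whole idea.
const results = [
  check("4 is even", isEven(4), true),
  check("7 is not even", isEven(7), false),
  check("0 is even", isEven(0), true),
];

// The run is green only if every single check passed.
const allPassed = results.every(Boolean);
console.log(allPassed ? "All checks passed" : "Some checks failed");
```

Frameworks like Jest and Vitest add grouping, richer matchers, and nicer reporting on top, but the core loop is the same one shown here.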

Here's a small function that does a real job: it applies a percentage discount to a price.

// discountPrice.js
export function discountPrice(originalPrice, percentOff) {
  if (percentOff < 0 || percentOff > 100) {
    throw new Error("percentOff must be between 0 and 100");
  }
  return originalPrice * (1 - percentOff / 100);
}

And here's its first test — using the Arrange → Act → Assert pattern (AAA), the universal shape of every good test:

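// discountPrice.test.js  (Vitest shown; Jest uses the same describe/it/expect API)
// Vitest needs this import unless `globals: true` is set in its config; Jest provides these as globals, so omit it there.
import { describe, it, expect } from "vitest";
import { discountPrice } from "./discountPrice.js";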

describe("discountPrice", () => {
  it("applies a 20% discount correctly", () => {
    // Arrange — set up inputs
    const price = 100;
    const discount = 20;

    // Act — call the thing under test
    const result = discountPrice(price, discount);

    // Assert — verify the outcome
    expect(result).toBe(80);
  });

  it("throws when percentOff is negative", () => {
    // Arrange + Act + Assert in one line when it's this simple
    expect(() => discountPrice(100, -5)).toThrow("percentOff must be between 0 and 100");
  });

  it("returns full price when discount is 0", () => {
    expect(discountPrice(50, 0)).toBe(50);
  });
});

Mental model: a test is executable documentation. Anyone reading these three tests immediately understands what discountPrice is supposed to do, what valid inputs look like, and what invalid inputs should do — without reading a single comment or a README.

Why it matters: when you (or your AI assistant) change discountPrice six months from now, the tests will tell you in under a second whether you broke anything. That's the confidence to change code without fear.

Common mistake: writing tests that only test the happy path. The bug almost always lives in the edge cases — negative numbers, empty strings, nulls, boundary values. Test at least one failure mode per function.
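
Here's a hedged sketch of what probing those edge cases can look like for discountPrice. It's written as plain JavaScript, with no test framework, so it runs on its own. The helper throwsFor is a hypothetical name made up for this example.

```javascript
// The lesson's function, reproduced without `export` so this sketch is self-contained.
function discountPrice(originalPrice, percentOff) {
  if (percentOff < 0 || percentOff > 100) {
    throw new Error("percentOff must be between 0 and 100");
  }
  return originalPrice * (1 - percentOff / 100);
}

// Hypothetical helper: true if discountPrice throws for this percentOff.
function throwsFor(percentOff) {
  try {
    discountPrice(50, percentOff);
    return false;
  } catch {
    return true;
  }
}

// Boundary values: both ends of the valid range.
const atZero = discountPrice(50, 0);      // 50 (full price)
const atHundred = discountPrice(50, 100); // 0 (free)

// Just outside the range: the guard should reject both.
const rejectsJustOver = throwsFor(100.01); // true
const rejectsJustUnder = throwsFor(-0.01); // true

// Missing input: comparisons against undefined are false, so the guard
// is skipped and the arithmetic quietly produces NaN instead of throwing.
const withMissingPercent = discountPrice(50, undefined);
console.log(Number.isNaN(withMissingPercent)); // true
```

The last probe is the interesting one: the function has a gap that the three original tests never touch. One possible fix is a Number.isFinite guard at the top of the function; whether to throw or to treat a missing discount as 0 is a design decision for your app.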


🔺 The testing pyramid

Not all tests are equal. They have different costs and different payoffs.

                     /\
                    /  \
                   / E2E\       ← Few (slow, brittle, expensive to write)
                  /------\         "Does the whole app work end-to-end?"
                 /        \
                /Integration\   ← Some (medium speed, test modules together)
               /------------\     "Do these two pieces work when connected?"
              /              \
             /   Unit Tests   \  ← Lots (fast, precise, cheap to write/run)
            /------------------\  "Does this one function do the right thing?"

Mental model: think of it as a quality filter. Unit tests catch most logic bugs in milliseconds. Integration tests catch the bugs that only appear when two correct pieces are connected incorrectly. E2E tests verify that the user experience works end-to-end, but they're slow and they break when you change a button label.

Why it matters: if your entire test suite is E2E tests, every test run takes 15 minutes and breaks constantly. If it's all unit tests with no integration tests, you'll pass every test and still ship a broken app because the pieces don't talk to each other correctly. Balance is the craft.

Common mistake: writing E2E tests for every feature because they "feel more real." They are more real, and they are also far harder to maintain. Start with unit tests. Add integration tests when two systems touch. Add E2E tests for your two or three most critical user flows, nothing more.
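
To illustrate "two correct pieces connected incorrectly," here's a minimal sketch. The functions formatCents, priceLabelBuggy, and priceLabel are hypothetical, invented for this example. Each unit is correct on its own; the bug lives in the wiring.

```javascript
// Unit 1 (from the lesson): returns a price in dollars.
function discountPrice(originalPrice, percentOff) {
  if (percentOff < 0 || percentOff > 100) {
    throw new Error("percentOff must be between 0 and 100");
  }
  return originalPrice * (1 - percentOff / 100);
}

// Unit 2 (hypothetical): formats an integer number of cents for display.
function formatCents(cents) {
  return "$" + (cents / 100).toFixed(2);
}

// Wiring bug: hands dollars to a function that expects cents.
function priceLabelBuggy(price, percentOff) {
  return formatCents(discountPrice(price, percentOff));
}

// Correct wiring: convert dollars to whole cents before formatting.
function priceLabel(price, percentOff) {
  return formatCents(Math.round(discountPrice(price, percentOff) * 100));
}

// Unit-level checks pass for both pieces...
console.log(discountPrice(100, 20)); // 80
console.log(formatCents(1999));      // "$19.99"

// ...but only a check across the seam exposes the mismatch.
console.log(priceLabelBuggy(100, 20)); // "$0.80"
console.log(priceLabel(100, 20));      // "$80.00"
```

Unit tests on discountPrice and formatCents would both stay green here. Only a test that calls the combined function, which is what an integration test does, would go red.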


🗺️ ROI — test what matters most

You don't need 100% test coverage. You need coverage of the code that would hurt the most if it broke.

High ROI to test:                     Low ROI to test:
─────────────────────────────────     ──────────────────────────────
✓ Pricing, billing, discounts         ✗ Pure UI positioning/styling
✓ Auth logic (who can see what)       ✗ Third-party library internals
✓ Data transforms (input → output)    ✗ Auto-generated boilerplate
✓ Error handling / edge cases         ✗ Config files
✓ Anything involving money            ✗ "Obvious" one-liners with no logic
✓ Functions called by many callers

Mental model: imagine your app is a city. The testing pyramid is your inspection system. You don't need inspectors on every street corner — you need them at the water treatment plant, the power grid, and the bridge. Test the critical path first.

Common mistake: chasing a coverage number (100%, 80%, whatever your CI badge says) instead of chasing confidence. A test file with 200 trivial assertions on styling code gives you a green badge and zero protection. Ten focused tests on your checkout flow give you actual safety.


🛠️ Your mission

Take one function from your current project — ideally one that handles data, calculates something, or makes a decision. If you don't have one yet, use discountPrice above.

Write its first test file with these three tests:

  1. The happy path — the normal case with a normal input that should succeed.
  2. An edge case — zero, empty, maximum, or minimum input.
  3. An error case — invalid input that should throw or return an error state.

Run the tests with npx vitest run or npx jest. Watch them pass. Then deliberately break your function (change a + to a - or flip a condition). Watch the tests catch the bug. Undo the break. That moment — tests going red, then green — is the thing this entire track is about.
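
If you want to preview the red-then-green moment before touching your own code, here's a rough framework-free sketch. The names discountPriceBroken and happyPathPasses are hypothetical, made up for this illustration.

```javascript
// The correct implementation from the lesson.
function discountPrice(originalPrice, percentOff) {
  if (percentOff < 0 || percentOff > 100) {
    throw new Error("percentOff must be between 0 and 100");
  }
  return originalPrice * (1 - percentOff / 100);
}

// A deliberately broken copy: the "-" has been flipped to a "+".
function discountPriceBroken(originalPrice, percentOff) {
  if (percentOff < 0 || percentOff > 100) {
    throw new Error("percentOff must be between 0 and 100");
  }
  return originalPrice * (1 + percentOff / 100);
}

// The happy-path check, written so it can be pointed at either version.
function happyPathPasses(implementation) {
  return implementation(100, 20) === 80;
}

console.log(happyPathPasses(discountPrice));       // true  (green)
console.log(happyPathPasses(discountPriceBroken)); // false (red)
```

In your real project you won't keep two copies around. You edit the function, run the tests, watch one fail, and revert. The sketch just puts both states side by side.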


✅ You're done when…

  • You have a test file with at least three tests covering happy path, edge case, and error case
  • All three tests pass with npx vitest run or npx jest before you commit
  • You've deliberately broken your function and confirmed a test went red, then fixed it and confirmed green
  • Your test names read like sentences that describe behavior ("applies a 20% discount correctly"), not implementation ("calls the function with 100 and 20")
  • You've added "run the test suite" to the Pre-Ship Checklist as a required step before any future feature ships

➡️ Next: Your First Real Tests. Build It Right, Or Don't Build It At All. 🏛️
