Building with AI (Vibe Coding)

⏱ About 20 min · 20 XP

Evaluating a Build Approach

All of the conceptual content in this module converges on a practical question you will face repeatedly as a builder: given a specific problem, what is the right approach? Vibe coding alone? Vibe coding with additional expert review? A no-code platform? Traditional programming? Buying existing software? Not building at all? The answer depends on factors you can now analyze systematically. This lesson gives you a framework for that analysis and puts it into practice.

The Build Approach Decision Framework

Choosing how to build something requires evaluating five dimensions. No single dimension determines the answer; the full picture of all five does.

Dimension 1: Novelty. How similar is this problem to problems that have already been solved? If it is a standard task (a CRUD web application, a data-processing script, a form with validation), vibe coding is well-suited; the AI has extensive training data for these patterns. If the problem requires a novel algorithm, original scientific computation, or domain-specific logic with no established pattern, vibe coding alone is unreliable. Assessing novelty honestly requires knowing your domain well enough to tell whether "this type of thing" has been built before.

Dimension 2: Stakes. What is the cost of failure? A personal habit tracker that produces wrong counts costs the user a minor annoyance. A medication dosing calculator that produces wrong numbers can cost a patient's health. Stakes are determined by who is affected (just you, or others?), what is affected (data, money, health, safety?), and whether the consequences of a failure can be undone. High-stakes systems require rigorous review processes, such as security audits, compliance checks, and professional engineering oversight, that go beyond standard vibe coding practice.

Dimension 3: Scope complexity. How many interacting components does the system require? A single-purpose script has low scope complexity. A multi-user system with a database, an API, an administrative interface, and external service integrations has high scope complexity. Scope complexity multiplies the risk of integration failures, the category of bug that systems thinking catches and that vibe coding without systems thinking misses. Higher scope complexity requires more disciplined decomposition, more explicit data contracts, and more incremental testing.

Dimension 4: Evaluability. Can you, the builder, evaluate whether the AI's output is correct? This is the evaluation gap from the previous lesson. If you can read the generated code, understand its logic, test it against realistic inputs, and identify its failure modes, evaluability is adequate. If you cannot read the code, do not understand the domain well enough to design test cases, or cannot distinguish a correct implementation from a plausible-looking wrong one, evaluability is low. Low evaluability is not a permanent condition, because it improves with learning, but it is a current constraint that must be honestly acknowledged.

Dimension 5: Maintenance horizon. How long will this software need to run and be updated? A script you run once and discard has a short maintenance horizon. Internal tooling used by your team for years has a long one. Long maintenance horizons amplify the cost of technical debt and the importance of code clarity and architecture quality. AI-generated code that works now but was not designed for maintainability can become expensive to modify over time, especially if the original builder is no longer available.

The Framework Is Not a Formula

These five dimensions do not feed into a formula that outputs a correct answer. They are lenses that structure your judgment. Two experienced builders analyzing the same project might weigh the dimensions differently and reach different defensible conclusions. What the framework prevents is the common failure mode of choosing vibe coding because it is fast without asking whether it is appropriate.

Let us apply the framework to three realistic cases and reason through each.

Case A: A graphic design student wants a script that automatically resizes all images in a folder to a standard width, preserving aspect ratio, and saves them to a new folder.
Novelty: Very low. Image-resizing scripts are among the most common scripts ever written.
Stakes: Very low. The worst case is a set of wrongly resized images, which is easily undone because the originals stay in their own folder.
Scope: Very low. This is a single-purpose script with no integrations.
Evaluability: High. The output is a set of images the student can inspect immediately.
Maintenance: Minimal. This is a one-time tool.
Decision: Vibe coding alone is excellent here. This is a textbook appropriate use case.

Case B: A small dental practice wants software to manage patient appointment scheduling, storing patient names, contact information, appointment times, and dentist preferences.
Novelty: Low. Scheduling apps follow common patterns.
Stakes: Medium-high. Patient information held by a dental practice is regulated under HIPAA in the United States, and a data breach has legal and reputational consequences.
Scope: Medium. There are multiple components (calendar, database, possibly email reminders).
Evaluability: Medium. The scheduling logic is inspectable, but assessing HIPAA compliance requires specialized knowledge.
Maintenance: Long. This will run for years.
Decision: Vibe coding is a reasonable starting point for the scheduling logic, but a HIPAA compliance review by someone with healthcare IT expertise is non-negotiable before deployment. The dentist should not assume AI-generated code is HIPAA-compliant.

Case C: A startup wants to build a real-time recommendation engine that suggests products to users based on their browsing history, personalized per user.
Novelty: Medium. Recommendation systems have established patterns, but personalization at scale requires performance and algorithmic expertise.
Stakes: Medium. Errors surface irrelevant products, which is not immediately dangerous, but the business model depends on accuracy.
Scope: High. The system needs a real-time data pipeline, a machine-learning model, an API, a caching layer, and A/B testing infrastructure.
Evaluability: Low for most non-specialists. The correctness of a recommendation algorithm cannot be checked by visual inspection.
Maintenance: Long.
Decision: Vibe coding can scaffold individual components and handle routine web app logic, but the recommendation algorithm and performance architecture require traditional engineering expertise. This is a hybrid build, not a pure vibe-coding project.
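Returning to Case A, part of what makes its evaluability high is that the core logic is small enough to check by hand. The sketch below shows only the aspect-ratio arithmetic; the function name `scaled_size` and the choice to round to the nearest pixel are assumptions made for illustration, not something the lesson specifies.

```python
def scaled_size(width: int, height: int, target_width: int) -> tuple[int, int]:
    """Return (new_width, new_height) that keeps the original aspect ratio.

    Hypothetical helper for the Case A script; rounding to the nearest
    pixel is an assumption, other scripts may truncate instead.
    """
    if width <= 0 or height <= 0 or target_width <= 0:
        raise ValueError("all dimensions must be positive")
    # Scale the height by the same factor applied to the width,
    # and never let a very wide image collapse to zero pixels tall.
    new_height = max(1, round(height * target_width / width))
    return target_width, new_height


# A 4000x3000 photo resized to a standard width of 800 pixels.
print(scaled_size(4000, 3000, 800))  # prints (800, 600)
```

A builder can verify this with a calculator (3000 × 800 / 4000 = 600) and then confirm it visually on a few real images. In a full script, an imaging library (Pillow is one common choice) would do the actual pixel work; that part is left out here.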

Match each project characteristic to the decision-framework dimension it maps to.

Terms

The app stores medical records subject to HIPAA
The builder cannot verify if the AI's algorithm is correct
The problem requires a novel signal-processing algorithm
The system will be actively developed and used for five or more years
The app has eight interacting services that share data

Definitions

Evaluability
Stakes
Maintenance horizon
Novelty
Scope complexity


When Not Building Is the Right Answer

The framework includes one option that builders often overlook: not building at all. If a suitable existing tool, library, or service already exists, building a custom version is often the wrong choice — regardless of build approach. Custom software carries ongoing maintenance costs; every bug, every security patch, every compatibility update is the builder's responsibility. An existing tool has a team maintaining it. The question 'should I build this?' precedes the question 'how should I build this?' Vibe coding lowers the cost of building to the point where it can tempt builders into constructing things that already exist in better-maintained form. Evaluating a build approach includes honestly asking: does this already exist? Is an existing solution adequate for my needs? If the answer is yes, the best build approach is no build at all.

Start With the Search

Before opening an AI assistant to start building, search for existing solutions. A one-hour search that finds an existing tool that is 90% of what you need is almost always better than building from scratch — even with AI assistance. The 10% gap can often be closed with configuration, a small adapter script, or an acceptable compromise.

Using the decision framework, what makes Case B (dental scheduling app) different from Case A (image-resizing script) in terms of build approach recommendation?

A student argues that vibe coding is always the fastest approach, so it should always be used. What does the decision framework reveal about this argument?

Framework Application Workshop

  1. Step 1: Your teacher will assign each group one of the following projects:
       Project A: A to-do list app for personal use on your own phone.
       Project B: A platform that matches local volunteers with nonprofits, storing volunteer background-check status.
       Project C: A script that analyzes your school's public sports schedule and sends you a text when your team's next game is within 24 hours.
       Project D: Software that monitors air quality sensors across a city and alerts authorities when readings exceed safe thresholds.
  2. Step 2: Apply all five dimensions of the framework to your assigned project. Score each dimension Low / Medium / High and write a two-sentence justification for each score.
  3. Step 3: Based on your five-dimension analysis, write a build-approach recommendation: pure vibe coding, vibe coding plus specific additional review, hybrid with traditional engineering, or do not build (use existing solution). Justify each part of your recommendation.
  4. Step 4: Present your analysis to the class. After each presentation, the class votes on whether they agree with the recommendation, then discusses disagreements.

The goal is not unanimous agreement; it is rigorous reasoning.
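For groups that want to keep their Step 2 scores in a structured form, here is one possible sketch of a worksheet. It only records scores and justifications and checks that nothing is missing; it deliberately does not compute a recommendation, since the framework is not a formula. The names `Score`, `DIMENSIONS`, and `missing_or_invalid` are hypothetical, and the example maps the lesson's "Very low" and "Minimal" ratings for Case A onto "Low" as an assumption to fit the workshop's three-level scale.

```python
from dataclasses import dataclass

# The five framework dimensions and the workshop's three-level scale.
DIMENSIONS = ("novelty", "stakes", "scope_complexity",
              "evaluability", "maintenance_horizon")
LEVELS = ("Low", "Medium", "High")


@dataclass(frozen=True)
class Score:
    level: str          # one of LEVELS
    justification: str  # the group's written reasoning


def missing_or_invalid(worksheet: dict[str, Score]) -> list[str]:
    """List the problems that leave a Step 2 worksheet incomplete."""
    problems = []
    for name in DIMENSIONS:
        score = worksheet.get(name)
        if score is None:
            problems.append(f"{name}: not scored")
        elif score.level not in LEVELS:
            problems.append(f"{name}: level must be one of {LEVELS}")
        elif not score.justification.strip():
            problems.append(f"{name}: justification is empty")
    return problems


# Case A from the lesson, with "Very low"/"Minimal" mapped to "Low" (assumption).
case_a = {
    "novelty": Score("Low", "Image-resizing scripts are extremely common."),
    "stakes": Score("Low", "Wrongly resized images are easily undone."),
    "scope_complexity": Score("Low", "Single-purpose script, no integrations."),
    "evaluability": Score("High", "The student can inspect the output images."),
    "maintenance_horizon": Score("Low", "It is a one-time tool."),
}
print(missing_or_invalid(case_a))  # prints []
```

The recommendation in Step 3 still has to be written and defended by the group; the check above only confirms that every dimension was considered before that judgment is made.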