Building with AI (Vibe Coding)


What the AI Does and Doesn't Do

One of the fastest ways to fail at vibe coding is to misunderstand what the AI is actually doing. Overestimate its capabilities and you ship broken software without realizing it. Underestimate them and you do work by hand that the AI could handle, wasting the tool's value. This lesson maps the division of labor precisely — not as a theoretical exercise, but as a practical necessity for anyone who wants to build something that works.

What the AI Does

A capable AI code-generation model does several things well, provided the intent specification is adequate.

Pattern matching at scale. The model has been trained on an enormous corpus of human-written code, effectively the combined output of millions of programmers across hundreds of languages and frameworks. It has seen how similar problems have been solved before, and it reproduces and adapts those patterns in response to your description. This is why it can produce working boilerplate for common tasks (authentication flows, database queries, file-parsing routines) with high reliability. These problems have been solved the same way thousands of times, and the training data is full of examples.

Syntax translation. Given a clear description of logic, the AI can produce syntactically correct code in most major languages. It rarely forgets semicolons, miscounts parentheses, or reverses loop conditions. This is the clearest and most reliable part of what it does.

Structural scaffolding. For larger projects, the AI can generate the skeleton of a system: file structure, class definitions, function signatures, module organization. This scaffold is not always the best possible design, but it gives the human something concrete to evaluate and modify rather than a blank screen.

Explanation and documentation. The AI can explain code it has generated, describe what a function does, generate comments, and write documentation, often more thoroughly and patiently than a human would.

What the AI Is Actually Doing

AI code generation is sophisticated pattern completion, not problem-solving from first principles. The model predicts what code is likely to follow a given prompt based on its training data. When your problem closely resembles problems in that training data, predictions are accurate. When your problem is novel, domain-specific, or requires reasoning the training data does not cover well, accuracy drops. Knowing this shapes how you should use the tool.

What the AI does not do is equally important to understand.

The AI does not understand your actual goals. It processes your description as text and predicts a plausible response. If your description is ambiguous, the AI produces something plausible, not necessarily something correct. It has no model of your business, your users, or your constraints beyond what you put in the prompt.

The AI does not verify that the code runs correctly. It generates code that looks like it should work based on patterns it has seen. It does not execute the code, does not test it against your data, and does not know whether your API endpoints exist or your database schema matches what it assumed.

The AI does not make architectural decisions with your interests in mind. It will suggest a technology stack, a data structure, an API design, but those suggestions are pattern matches from its training data, not reasoned analysis of your specific requirements, team capabilities, or long-term maintenance needs.

The AI does not learn from your session. Each prompt begins from the model's training, not from accumulated understanding of your project (unless you explicitly provide context in the prompt or through a memory mechanism). It can produce contradictory code in two different responses without recognizing the contradiction.

The AI does not flag security issues unless prompted to. It will generate SQL queries vulnerable to injection attacks, forms without input validation, APIs without authentication. This is not because it cannot recognize these problems if asked, but because it optimizes for answering the question you asked, not the question you should have asked.
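The injection risk described above can be made concrete with a minimal sketch. It uses Python's built-in `sqlite3` module; the `users` table, the sample rows, and the function names `find_user_unsafe` and `find_user_safe` are hypothetical and exist only for illustration. The first lookup builds the SQL text from user input, which is the kind of code that can look perfectly reasonable in generated output. The second passes the input as a parameter.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: user input is pasted straight into the SQL text.
    query = f"SELECT name, email FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the value travels separately from the SQL text.
    return conn.execute(
        "SELECT name, email FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    attack = "nobody' OR '1'='1"
    print(len(find_user_unsafe(conn, attack)))  # 2: every row comes back
    print(len(find_user_safe(conn, attack)))    # 0: no user has that name
```

Both functions return the same result for an ordinary name such as `alice`, which is why a quick happy-path check would not reveal the difference.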

Prompt Challenge

Write a prompt that asks an AI to build a user registration form and explicitly asks it to flag any security concerns in the code it generates.

Your prompt should…

  • specify the form fields and their validation rules
  • request that the AI identify security vulnerabilities in its own output
  • describe the backend endpoint that will receive the form data

The Human's Non-Negotiable Responsibilities

Given this division of labor, certain responsibilities cannot be delegated to the AI. This is not because AI is incapable of performing them someday, but because delegating them without verification is precisely what creates broken or dangerous software.

Goal clarity. Only the human knows what the software actually needs to do, for whom, and in what context. The AI cannot know this; it must be told.

Requirements completeness. The human must think through what the complete requirements are: not just the happy path, but error conditions, edge cases, and security requirements. The AI will not invent requirements the human did not specify.

Output verification. The human must test whether the generated code does what was intended. This means running it, feeding it realistic inputs, checking outputs, and probing for failures. The AI's confidence in its own output is not evidence of correctness.

Architectural judgment. For anything beyond a single script, the human must make or validate decisions about structure, data models, and technology choices. The AI's suggestions should be understood as options, not answers.

Security and privacy responsibility. The human deploying the software is legally and ethically responsible for protecting users. AI-generated code must be reviewed for security vulnerabilities before deployment to real users.
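As a rough illustration of output verification, the sketch below probes a small function with realistic and hostile inputs. Everything in it is hypothetical: `split_bill_generated` stands in for plausible-looking generated code, `split_bill_fixed` is one possible revision after probing, and `probe` is a hand-written check, not part of any testing library.

```python
def split_bill_generated(total_cents: int, people: int) -> list[int]:
    # Stand-in for generated code: reads fine and passes the obvious case.
    return [total_cents // people] * people


def split_bill_fixed(total_cents: int, people: int) -> list[int]:
    # One possible revision: reject bad input, hand out leftover cents.
    if people <= 0:
        raise ValueError("people must be a positive integer")
    share, leftover = divmod(total_cents, people)
    return [share + 1 if i < leftover else share for i in range(people)]


def probe(split) -> list[str]:
    """Run a few inputs through `split` and describe every failure found."""
    failures = []
    # The shares must always add back up to the total.
    for total, people in [(1000, 4), (1000, 3), (1, 2)]:
        shares = split(total, people)
        if sum(shares) != total:
            failures.append(
                f"{total} cents / {people} people: shares sum to {sum(shares)}"
            )
    # Splitting between zero people should fail with a clear error.
    try:
        split(1000, 0)
        failures.append("0 people: no error raised")
    except ValueError:
        pass
    except ZeroDivisionError:
        failures.append("0 people: crashed with ZeroDivisionError")
    return failures


if __name__ == "__main__":
    for problem in probe(split_bill_generated):
        print("generated:", problem)
    print("fixed:", probe(split_bill_fixed) or "no failures found")
```

The generated version passes the even split of 1000 cents between 4 people, which is the case most people would try first. The problems only show up once uneven totals and invalid input are fed in.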

Confidence Without Correctness

AI models generate text with consistent apparent confidence regardless of whether the output is correct, secure, or even runnable. A model that generates a wrong answer looks exactly like one that generates a right answer. This is not a flaw unique to low-quality models — it is a fundamental property of how these systems work. Never treat fluency of output as evidence of correctness.

Why does an AI code-generation model perform reliably on common tasks like authentication flows, but less reliably on novel domain-specific problems?

A vibe coder says, 'The AI generated the code without any errors, so it should be secure.' What is wrong with this reasoning?

The Division of Labor Audit

  1. Imagine you have used an AI to generate a simple web form that lets users submit a message, which gets stored in a database and emailed to you.
  2. Create two columns on paper: 'AI handled this' and 'I must verify this.'
  3. Work through the following concerns, placing each in the correct column: form fields are syntactically correct HTML; the form actually sends data to the right endpoint; submitted messages are stored without injecting malicious SQL; user email addresses are validated; the email-sending code uses real credentials; the form works on mobile browsers; the form does not allow someone to submit 10,000 messages per minute to spam you.
  4. For every item in 'I must verify this,' write one specific test or check you would perform to verify it.
  5. Discuss: How does this exercise change your view of what 'done' means when building with AI?
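For the last concern in the third step (someone submitting 10,000 messages per minute), a specific check might look like the sketch below. It assumes a simple in-memory, per-client limit; the `SubmissionLimiter` class, the limit of 5 submissions per 60 seconds, and the client identifier are all illustrative choices, not something the exercise prescribes. A real deployment would more likely enforce this at the web framework or proxy layer.

```python
from collections import defaultdict, deque


class SubmissionLimiter:
    """Allow at most `limit` submissions per `window` seconds per client."""

    def __init__(self, limit: int = 5, window: float = 60.0):
        self.limit = limit
        self.window = window
        # client id -> timestamps of that client's accepted submissions
        self._seen = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        stamps = self._seen[client_id]
        # Forget submissions that have aged out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        return True


def check_flood_is_blocked() -> int:
    """Simulate 10,000 submissions in one minute; return how many got through."""
    limiter = SubmissionLimiter(limit=5, window=60.0)
    accepted = 0
    for i in range(10_000):
        now = i * 0.006  # 10,000 evenly spaced timestamps within 60 seconds
        if limiter.allow("203.0.113.7", now):
            accepted += 1
    return accepted


if __name__ == "__main__":
    print(check_flood_is_blocked())  # 5
```

Timestamps are passed in explicitly so the check runs instantly and gives the same result every time, instead of depending on the real clock.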