Relational Databases & SQL
Stage 3 · Data & Databases · B.U.I.L.D. letter: U
You already know how to build a UI that feels alive. Now let's talk about the layer underneath everything — the one that makes data stick around after the page reloads, survives server restarts, and scales to a million users. That layer is almost always a relational database, and the language you use to talk to it is SQL. It is 50 years old and it is still the most important skill you can add to your engineering toolkit right now.
⚠️ The vibe trap
Vibe coding gets you far: you describe what you want, an AI generates a SQL query, you paste it in and it works. Amazing. But when it stops working — wrong results, deleted rows you didn't mean to delete, a JOIN that silently returns duplicates — you have no idea where to look. SQL you cannot read is SQL you cannot fix. This lesson gives you the mental model so you are never stuck staring at a query you are afraid to touch.
🗂️ What a relational database actually is
A relational database is a collection of tables. Each table is like a spreadsheet with a fixed set of typed columns and any number of rows. Every row represents one thing (one user, one order, one product). Every column represents one fact about that thing (their email address, the order total, the product price).
-- This is what a users table looks like in your head:
-- id | email | display_name | created_at
-- ----|------------------------|---------------|--------------------
-- 1 | alex@example.com | Alex | 2025-01-10 09:00:00
-- 2 | jordan@example.com | Jordan | 2025-01-11 14:30:00
-- 3 | sam@example.com | Sam | 2025-01-12 08:15:00
Mental model: A table is a contract. Every row in the table has exactly the same columns, and every column holds exactly the type of data you declared when you created the table. The database enforces that contract for you — so you never end up with a row where id is the string "banana".
Why this matters: Unlike a JSON file or a JavaScript object where every item can have different keys, a relational table is uniform. That uniformity is what makes it possible to query millions of rows fast.
Common mistake: Treating a database table like a JavaScript array of objects where different rows can have different shapes. They cannot. If you need variable shape, there is a column type for that (jsonb in Postgres) — but that is a conscious choice, not the default.
🏗️ CREATE TABLE — defining your contract
Before you can store anything, you declare the shape of your table. This is called a schema. You define each column's name, its data type, and any constraints.
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
sku TEXT NOT NULL UNIQUE,
price_cents INTEGER NOT NULL,
in_stock BOOLEAN NOT NULL DEFAULT true,
description TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
Let's walk through each piece:
| Piece | What it does |
|---|---|
SERIAL | Auto-incrementing integer — the DB assigns the next number for you |
PRIMARY KEY | Every row gets a unique identifier; no two rows share an id |
NOT NULL | The column must have a value — the DB rejects a row without it |
UNIQUE | No two rows can have the same value in this column |
DEFAULT true | If you insert a row without specifying this column, it gets true |
TEXT | Variable-length string (no length limit needed in Postgres) |
INTEGER | Whole number. Store money as cents — never as a float |
TIMESTAMPTZ | Timestamp with timezone. Always use this, never TIMESTAMP alone |
Mental model: CREATE TABLE is a promise to the database. You are saying "every product will have a name, a SKU, and a price — no exceptions." The database holds you to that promise on every INSERT.
Why this matters: Constraints are free bug prevention. NOT NULL means your application code never has to defensively check if product.name is not None everywhere — the DB already guaranteed it exists.
Common mistake: Skipping NOT NULL on columns you actually always want. Every nullable column is a place where null can sneak into your application and cause a crash at 2 a.m.
➕ INSERT — putting data in
-- Insert one product
INSERT INTO products (name, sku, price_cents, description)
VALUES ('Wireless Headphones', 'WH-1001', 4999, 'Over-ear, noise cancelling');
-- Insert multiple products at once
INSERT INTO products (name, sku, price_cents, in_stock)
VALUES
('USB-C Hub', 'HUB-200', 2499, true),
('Laptop Stand', 'STD-300', 3499, true),
('Webcam 1080p', 'CAM-400', 5999, false);
Notice what we did not include: id and created_at. Both have defaults — SERIAL auto-generates the id and DEFAULT now() fills in the timestamp. The database handles them.
Mental model: INSERT tells the table "here is a new row." You list the columns you are providing values for, then list those values in the same order. Columns with defaults or SERIAL can be omitted entirely.
Why this matters: You almost never INSERT an id yourself. If you do, you risk collisions. Let the database's auto-increment handle it.
Common mistake: Inserting in a different column order than your VALUES order. INSERT INTO products (sku, name, price_cents) VALUES ('A', 100, 'Widget') silently stores 100 as the sku. Always name your columns explicitly in the INSERT.
🔍 SELECT — reading data back
SELECT is the verb you will write the most. It asks the database "give me some rows from a table."
-- Get every column from every product
SELECT * FROM products;
-- Get only the columns you need (prefer this over *)
SELECT id, name, price_cents FROM products;
-- Filter with WHERE
SELECT id, name, price_cents
FROM products
WHERE in_stock = true;
-- Combine conditions
SELECT id, name, price_cents
FROM products
WHERE in_stock = true
AND price_cents < 5000;
-- Sort results
SELECT id, name, price_cents
FROM products
WHERE in_stock = true
ORDER BY price_cents ASC;
-- Limit how many rows come back
SELECT id, name, price_cents
FROM products
ORDER BY created_at DESC
LIMIT 10;
Mental model: Think of SELECT as a filter and a projector. WHERE is the filter — it keeps only the rows that match. The column list after SELECT is the projector — it keeps only the columns you asked for. Everything else is thrown away before the data reaches your app.
Why this matters: SELECT * in production code is a bad habit. It sends every column across the network even if your app only uses two of them. Name your columns — your future self will also know exactly what the query returns without running it.
Common mistake: Forgetting WHERE on a query meant to return one row. SELECT * FROM products returns every product. SELECT * FROM products WHERE id = 7 returns the one you wanted. Without WHERE, you read the whole table.
✏️ UPDATE and 🗑️ DELETE — changing and removing data
These two verbs modify existing rows. They are powerful and they respect WHERE the same way SELECT does.
-- Update the price of one product
UPDATE products
SET price_cents = 3999
WHERE id = 1;
-- Mark all headphones as out of stock
UPDATE products
SET in_stock = false
WHERE name LIKE '%Headphones%';
-- Update multiple columns at once
UPDATE products
SET price_cents = 4499,
description = 'Over-ear, noise cancelling, updated model'
WHERE sku = 'WH-1001';
-- Delete a specific product
DELETE FROM products
WHERE id = 3;
-- Delete all out-of-stock items (dangerous — test your WHERE first)
DELETE FROM products
WHERE in_stock = false;
Mental model: UPDATE and DELETE affect every row that matches the WHERE clause. There is no undo. If your WHERE matches 50,000 rows, all 50,000 change.
Why this matters: This is where things go wrong in production. A missing WHERE on a DELETE hits every row in the table. A typo in a WHERE on an UPDATE corrupts half your data.
Common mistake — the most important one in this lesson:
-- ❌ DO NOT run this. It deletes every row in the table.
DELETE FROM products;
-- ✅ Always include WHERE unless you genuinely mean "all rows"
DELETE FROM products WHERE id = 3;
Before running any UPDATE or DELETE, run the equivalent SELECT first and confirm the row count is what you expect:
-- Step 1: Check what you are about to delete
SELECT id, name FROM products WHERE in_stock = false;
-- Step 2: If the count looks right, then delete
DELETE FROM products WHERE in_stock = false;
💡 SQL is declarative — and that changes how you think
Most code you write is imperative: "do this, then do that, then loop over these." SQL is declarative: you describe the result you want, and the database figures out how to get it.
-- You are not saying "start at row 1, check if in_stock is true,
-- if yes add it to a list, go to row 2..."
-- You are saying: "I want the name and price of every in-stock product."
SELECT name, price_cents
FROM products
WHERE in_stock = true
ORDER BY price_cents DESC;
The database's query planner decides whether to scan every row, use an index, or do something cleverer. You just describe what you want. This is why SQL scales — the same query on 100 rows and 10 million rows is the same SQL. The database handles the optimization.
Why this matters: Once this mental shift clicks, SQL stops feeling like a weird language and starts feeling like a superpower. You stop thinking "how do I loop through the database" and start thinking "what shape is the data I need?"
🛠️ Your mission
Your app has a main table — users, posts, products, bookings, whatever it stores. Write the following SQL statements against it. If your table does not exist yet, write the CREATE TABLE first.
-
CREATE TABLE — define your main table with at least 5 columns. Include
id SERIAL PRIMARY KEY, at least twoNOT NULLcolumns, one column with aDEFAULT, and acreated_at TIMESTAMPTZ NOT NULL DEFAULT now(). -
INSERT three realistic rows into your table. Name every column explicitly — do not rely on column order.
-
SELECT all rows, then write a second SELECT that filters with
WHEREto return only rows matching a real business condition (e.g., only published posts, only active users, only in-stock products). Add anORDER BY. -
UPDATE one row by its
id. Change at least two columns in a single UPDATE statement. -
DELETE one row by its
id. Before writing the DELETE, write the SELECT that would confirm you have the right row. -
Write one more SELECT that uses both
WHEREandLIMITtogether — the kind of query you would run to power a "recent items" feed.
✅ You're done when…
- Your CREATE TABLE compiles without errors and includes PRIMARY KEY, at least two NOT NULL columns, and a DEFAULT (tick off column types and constraints against the Data Modeling Cheat Sheet)
- Your INSERT statements name columns explicitly and do not include the
idcolumn - Your SELECT with WHERE returns a subset of rows, not the whole table
- Your UPDATE targets a specific row by
idand touches at least two columns - You ran the SELECT check before your DELETE and confirmed the row count
- You can explain in one sentence why SQL is declarative and what that means for performance
➡️ Next: SQL, Deeper. Build It Right, Or Don't Build It At All. 🏛️