Relationships in Practice
Stage 3 · Data & Databases · B.U.I.L.D. letter: U
You built the schema. You picked your types. Now your data needs to talk to itself — and that conversation is held through relationships. Get this right, and every query becomes a simple conversation between tables. Get it wrong, and you'll spend the rest of your life writing application code to untangle messes that the database could have solved for free.
⚠️ The vibe trap
The classic move when you're moving fast: you have a user and they can belong to multiple teams, so you store team_ids = [3, 7, 12] in a TEXT column. It works until you need to query "give me all users on team 7" — now you're doing LIKE '%7%' and praying nobody has a team 17. You can't index it cleanly, you can't enforce referential integrity, and every JOIN you write against it is a lie. Arrays of IDs in a column is a relationship pretending to be data. Give it a real table and let the database do what it was built for.
🔗 One-to-Many: The Workhorse
One author writes many posts. One team has many members. One order contains many line items. This is the most common relationship in any real system, and the rule is simple: the foreign key lives on the "many" side.
-- One author, many posts.
-- The FK goes on posts, NOT on authors.
CREATE TABLE authors (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT NOT NULL UNIQUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
author_id INT NOT NULL REFERENCES authors(id) ON DELETE CASCADE,
title TEXT NOT NULL,
body TEXT NOT NULL,
published BOOLEAN NOT NULL DEFAULT false,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Read: all posts with their author's name
SELECT
p.id,
p.title,
a.name AS author_name,
p.published
FROM posts p
JOIN authors a ON a.id = p.author_id
ORDER BY p.created_at DESC;
-- Read: one author's published posts only
SELECT p.title, p.created_at
FROM posts p
WHERE p.author_id = 42
AND p.published = true;
Mental model. Draw the arrow from the many side to the one side. posts.author_id → authors.id. The column that holds the arrow is always on the table with more rows.
Why it works this way. If you tried to store the relationship on authors (e.g., a post_ids array), you'd have the vibe trap from above. Putting it on posts means every post row is self-describing — you always know exactly which author it belongs to, and the database can enforce that the author actually exists.
Common mistake. Putting the FK on the wrong side. If you write authors.current_post_id you've modeled a one-to-one, not a one-to-many. When in doubt: which side has more rows? That side gets the FK column.
🔀 Many-to-Many: The Junction Table
A student can enroll in many courses. A course can have many students. Neither side "owns" the relationship — it lives in a third table that sits between them. This is called a junction table (also: join table, pivot table, associative table).
CREATE TABLE students (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE courses (
id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
code TEXT NOT NULL UNIQUE
);
-- The junction table.
-- Its PK is the combination of both FKs — no duplicate enrollments.
CREATE TABLE enrollments (
student_id INT NOT NULL REFERENCES students(id) ON DELETE CASCADE,
course_id INT NOT NULL REFERENCES courses(id) ON DELETE CASCADE,
enrolled_at TIMESTAMPTZ NOT NULL DEFAULT now(),
grade CHAR(2), -- can be NULL until graded
PRIMARY KEY (student_id, course_id) -- composite PK prevents duplicates
);
-- Read: all courses a given student is enrolled in
SELECT
c.code,
c.title,
e.enrolled_at,
e.grade
FROM enrollments e
JOIN courses c ON c.id = e.course_id
WHERE e.student_id = 7;
-- Read: all students in a given course
SELECT
s.name,
e.enrolled_at,
e.grade
FROM enrollments e
JOIN students s ON s.id = e.student_id
WHERE e.course_id = 101
ORDER BY s.name;
-- Read: full enrollment roster with both names
SELECT
s.name AS student,
c.title AS course,
e.grade
FROM enrollments e
JOIN students s ON s.id = e.student_id
JOIN courses c ON c.id = e.course_id
ORDER BY s.name, c.title;
Mental model. The junction table is not a hack — it IS the relationship. It can carry extra meaning too (notice grade and enrolled_at live there, not on either parent table, because they describe the relationship, not the student or the course).
Why it works this way. The composite PRIMARY KEY (student_id, course_id) does two things at once: it uniquely identifies the row AND it prevents the same student from enrolling in the same course twice. The database enforces both rules for free.
Common mistake. Adding a surrogate id SERIAL PRIMARY KEY to the junction table and forgetting to add a UNIQUE(student_id, course_id) constraint. Now duplicate enrollments can sneak in. If your junction table has extra payload columns (like grade) you might genuinely want a surrogate PK — but you still need that unique constraint on the FK pair.
🎯 One-to-One: Use It Sparingly
One user has exactly one profile. One order has exactly one shipping address. A one-to-one relationship means a row in table A corresponds to at most one row in table B. The FK goes on whichever side is the "extension" — the optional or late-arriving data.
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email TEXT NOT NULL UNIQUE
);
-- Profile is optional at signup — created later.
-- FK on profiles, not on users.
CREATE TABLE profiles (
user_id INT PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
bio TEXT,
avatar_url TEXT,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Read: user with their profile (LEFT JOIN handles users with no profile yet)
SELECT
u.id,
u.email,
p.bio,
p.avatar_url
FROM users u
LEFT JOIN profiles p ON p.user_id = u.id
WHERE u.id = 5;
Mental model. One-to-one is really just a one-to-many where you add a UNIQUE constraint on the FK. profiles.user_id is both the FK and the PK here — that single declaration enforces uniqueness automatically.
Why use it at all? If it's truly one-to-one, why not put all those columns on users? Sometimes you should. Good reasons to split: the extra columns are optional and mostly NULL (wasted storage), they're owned by a different service, they load on a different code path, or the main table is already wide. Don't split just because it feels cleaner — every JOIN has a cost.
Common mistake. Modelling a one-to-one when you actually need a one-to-many. "A user has one billing address" sounds fine until marketing says "allow multiple saved addresses." If there's any chance cardinality will change, default to a separate table with a plain FK (one-to-many) from day one.
🔒 Foreign Key Constraints & ON DELETE Behavior
A foreign key is a promise: "this value refers to a real row in another table." The ON DELETE clause tells the database what to do when the referenced row is deleted.
-- Three patterns, shown on the same schema
-- 1. CASCADE — delete the child rows automatically
-- Use when: children have no meaning without the parent
CREATE TABLE comments (
id SERIAL PRIMARY KEY,
post_id INT NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
body TEXT NOT NULL
);
-- Deleting a post automatically deletes all its comments. ✓
-- 2. RESTRICT (the default) — block the delete if children exist
-- Use when: orphaned children would be a data integrity violation
CREATE TABLE line_items (
id SERIAL PRIMARY KEY,
order_id INT NOT NULL REFERENCES orders(id) ON DELETE RESTRICT,
product_id INT NOT NULL REFERENCES products(id) ON DELETE RESTRICT,
quantity INT NOT NULL CHECK (quantity > 0),
unit_price NUMERIC(10,2) NOT NULL
);
-- Trying to delete an order that has line items will raise an error.
-- You have to explicitly delete line_items first — which is exactly what you want
-- for financial records where accidental cascade could be catastrophic. ✓
-- 3. SET NULL — orphan the child gracefully
-- Use when: the child row still makes sense without the parent
CREATE TABLE tasks (
id SERIAL PRIMARY KEY,
assigned_to INT REFERENCES users(id) ON DELETE SET NULL,
title TEXT NOT NULL
);
-- If a user is deleted, their tasks aren't deleted — they just become unassigned.
-- assigned_to must be NULLable for this to work. ✓
Mental model. Ask yourself: "If the parent row disappears, what should happen to the children?" Delete them (CASCADE), block the parent delete (RESTRICT), or keep them orphaned (SET NULL). The answer should come from the real-world domain, not from what's convenient to code.
Why let the database handle this? Application code can be bypassed — direct SQL queries, migrations, admin scripts, future engineers who don't know the rule. The database constraint fires for every delete, no matter who ran it or why. It's the last line of defense.
Common mistake. Using ON DELETE CASCADE everywhere because it's convenient. Cascading deletes on financial or audit data (payments, invoices, audit_logs) is a silent data-destruction bug waiting to happen. Default to RESTRICT for anything that matters; add CASCADE only where you've consciously decided child rows are trash without their parent.
🛠️ Your Mission
Pick the many-to-many relationship in your current app — the one you've been storing as a JSON array or a comma-separated string in a column.
- Identify both sides. Name them clearly: "A
[thing]can have many[other things], and a[other thing]can belong to many[things]." - Create the junction table. Give it a name that describes the relationship (e.g.,
user_tags,post_categories,recipe_ingredients), not just a concatenation. - Add the composite
PRIMARY KEYon both FK columns. - Pick the right
ON DELETEfor each FK — don't copy-paste; think about each one. - Write the three JOIN queries: all of B for a given A, all of A for a given B, and the full join.
- Drop the old array/string column once the junction table is live and tested.
If you don't have a many-to-many yet, model one that belongs in your app — they're in every non-trivial system.
✅ You're Done When…
- Your Data Modeling Cheat Sheet (from D2-01) has been updated: every relationship is labeled 1:M, M:M, or 1:1, with the FK side noted on each arrow
- Your junction table has a composite
PRIMARY KEY (a_id, b_id)— not a standaloneidwithout aUNIQUEconstraint on the FK pair - Every
ON DELETEclause in your schema is a conscious choice with a one-line comment explaining why (CASCADE, RESTRICT, or SET NULL), not a default you accepted without thinking - You can write all three JOIN patterns from memory: INNER JOIN for required relationships, LEFT JOIN for optional ones, and a three-table JOIN across a junction table
➡️ Next: NoSQL & When to Use It. Build It Right, Or Don't Build It At All. 🏛️