Talking to the Database (the Data-Access Layer)
Stage 3 · Backend & APIs · B.U.I.L.D. letter: L
How you read and write data is the single biggest factor separating a production app from a hackathon demo — get this wrong and you hand attackers a master key to every row you've ever stored.
⚠️ The vibe trap
When you ask an AI to "add a login endpoint," it will almost always concatenate user input directly into a SQL string and then return the entire database row — password hash, internal flags, and all — straight back to the caller. That's two catastrophic bugs in one endpoint: the first lets attackers dump your entire database with a single request; the second hands them the tools to crack every password you've ever stored. Neither shows up in a browser preview, which is exactly why this layer needs a human engineer watching it.
🔒 Parameterized Queries — The One Rule You Never Break
SQL injection is not a subtle exploit. If you build a query by gluing strings together, an attacker can terminate your query early and append anything they want — DROP TABLE users, SELECT * FROM users, or worse. Parameterized queries close this permanently: you write a query template with placeholders, and the database driver treats the values as data, never as SQL syntax.
Wrong (never do this):
// DO NOT WRITE THIS
const row = await db.query(
`SELECT * FROM users WHERE email = '${req.body.email}'`
);
If req.body.email is ' OR '1'='1, that query returns every user in your table.
Right (always do this):
// node-postgres (pg) — positional placeholders
const { rows } = await db.query(
'SELECT id, email, display_name FROM users WHERE email = $1',
[req.body.email]
);
// mysql2 — question-mark placeholders
const [rows] = await db.execute(
'SELECT id, email, display_name FROM users WHERE email = ?',
[req.body.email]
);
The driver serializes the value safely regardless of what it contains. Your query template never changes shape.
Mental model: Your SQL is a stencil. Parameterized values fill holes in the stencil — they can never rewrite the stencil.
Why it matters: OWASP has listed SQL injection as a top-ten vulnerability for over a decade. Automated scanners find it in minutes.
Common mistake: Using template literals because they "look cleaner." They look clean right up until someone types '; DROP TABLE orders; -- into your search box.
🏛️ The Repository Pattern — Keep SQL Out of Your Controllers
A controller's job is to receive a request, call business logic, and send a response. It should not know whether your data lives in Postgres, MongoDB, SQLite, or a flat file. When SQL leaks into controllers, you end up copy-pasting the same query in four places, and every schema change breaks everything.
The fix is a repository (also called a data-access object or DAO): a plain module that owns all queries for one domain concept. Your controller imports the repository and calls named functions. The repository is the only code that ever touches the database.
// src/repositories/userRepository.js
import { pool } from '../db/pool.js';
export async function findUserByEmail(email) {
const { rows } = await pool.query(
'SELECT id, email, display_name, password_hash FROM users WHERE email = $1',
[email]
);
return rows[0] ?? null; // null, not undefined — explicit not-found signal
}
export async function findUserById(id) {
const { rows } = await pool.query(
'SELECT id, email, display_name, created_at FROM users WHERE id = $1',
[id]
);
return rows[0] ?? null;
}
export async function createUser({ email, displayName, passwordHash }) {
const { rows } = await pool.query(
`INSERT INTO users (email, display_name, password_hash)
VALUES ($1, $2, $3)
RETURNING id, email, display_name, created_at`,
[email, displayName, passwordHash]
);
return rows[0];
}
Now your controller looks like this:
// src/controllers/authController.js
import { findUserByEmail } from '../repositories/userRepository.js';
import { toPublicUser } from '../dto/userDto.js';
export async function getProfile(req, res) {
const user = await findUserByEmail(req.params.email);
if (!user) return res.status(404).json({ error: 'User not found' });
res.json(toPublicUser(user));
}
No SQL. No database import. No raw rows.
Mental model: The repository is the only thing allowed to know how data is stored. Everything above it thinks in business objects, not tables and columns.
Why it matters: When you migrate from Postgres to Supabase, or add a caching layer, or rename a column, you change one file — not thirty controllers.
Common mistake: Writing "just this one query" inline in a controller because it's faster. Six weeks later you have twelve inline queries and no single place to enforce field whitelisting.
🎭 Row-to-DTO Mapping — Never Return Raw Database Rows
A raw database row contains everything the database knows about a record: password_hash, reset_token, internal_status_flag, stripe_customer_id, last_ip_address. None of that should ever reach a client. Not accidentally, not "just this once."
A DTO (Data Transfer Object) is a plain function that takes a raw row and returns only the fields a client is allowed to see. It also lets you rename database snake_case to camelCase and reshape nested data.
// src/dto/userDto.js
/**
* Strips sensitive fields and normalizes naming for client responses.
* NEVER return a raw user row from any endpoint.
*/
export function toPublicUser(row) {
if (!row) return null;
return {
id: row.id,
email: row.email,
displayName: row.display_name, // snake_case → camelCase
createdAt: row.created_at,
// password_hash: intentionally omitted
// reset_token: intentionally omitted
// internal_flags: intentionally omitted
};
}
/**
* Even slimmer shape for lists — only what a list item needs.
*/
export function toUserSummary(row) {
return {
id: row.id,
displayName: row.display_name,
};
}
Use it wherever you return user data:
// In a controller — always map before responding
const user = await findUserById(req.user.id);
if (!user) return res.status(404).json({ error: 'Not found' });
res.json(toPublicUser(user));
// In a list endpoint — map the whole array
const users = await listActiveUsers();
res.json(users.map(toUserSummary));
Mental model: The DTO is a border control checkpoint. Raw rows stay inside the server. Shaped objects cross to the client.
Why it matters: A single res.json(user) on a raw row has leaked real users' password hashes in production incidents at companies large enough to know better. One mapper function eliminates the entire class of risk.
Common mistake: Deleting fields by mutation (delete user.passwordHash) instead of building a new object. Mutation is fragile and easy to forget; explicit construction is explicit about what is shared.
🌊 Connection Pooling — Don't Open a New Connection Per Request
Opening a database connection is expensive: TCP handshake, authentication, SSL negotiation. On a busy API, doing that for every request will exhaust your database's connection limit within seconds. Connection pooling keeps a set of open connections ready and lends them out.
You configure the pool once at startup and import it everywhere:
// src/db/pool.js — create this file once, import it everywhere
import pg from 'pg';
const { Pool } = pg;
export const pool = new Pool({
connectionString: process.env.DATABASE_URL,
max: 10, // maximum open connections
idleTimeoutMillis: 30_000, // close idle connections after 30 s
connectionTimeoutMillis: 2_000, // fail fast if pool is exhausted
});
// Optional: log pool errors so they appear in your monitoring
pool.on('error', (err) => {
console.error('Unexpected pool error', err);
});
Then every repository file imports from here — never calls new Pool() or new Client() again:
// src/repositories/orderRepository.js
import { pool } from '../db/pool.js';
export async function findOrdersByUserId(userId) {
const { rows } = await pool.query(
`SELECT id, total_cents, status, created_at
FROM orders
WHERE user_id = $1
ORDER BY created_at DESC`,
[userId]
);
return rows; // still raw rows — map them in the controller or service
}
Mental model: The pool is a shared fleet of cars. Requests borrow one, drive it, return it. Nobody buys a new car for every trip.
Why it matters: Without pooling, each request opens and closes a connection. Under load — 50 concurrent requests on a Postgres instance with a limit of 100 — you'll hit too many connections errors within minutes. With a pool of 10, those same 50 requests queue and succeed.
Common mistake: Creating the pool inside a request handler or repository function. The pool must be a module-level singleton. Creating it inside a function means a new pool (and new connections) on every call.
🚨 Handling Not-Found Gracefully
When a repository returns null, the right response is 404 Not Found — not a crash, not a 500, and not an empty 200. Establish this pattern consistently so callers always know what to expect.
// src/repositories/postRepository.js
export async function findPostById(id) {
const { rows } = await pool.query(
'SELECT id, title, body, author_id, published_at FROM posts WHERE id = $1',
[id]
);
return rows[0] ?? null; // explicit null — not undefined, not an empty array
}
// src/controllers/postController.js
import { findPostById } from '../repositories/postRepository.js';
import { toPublicPost } from '../dto/postDto.js';
export async function getPost(req, res) {
const post = await findPostById(req.params.id);
if (post === null) {
return res.status(404).json({ error: 'Post not found' });
}
res.json(toPublicPost(post));
}
Keep the not-found check before any further logic. If you pass a null into your DTO mapper or business logic, you'll get a confusing TypeError: Cannot read properties of null in your logs instead of a clean 404 in your tests.
Mental model: null from the repository means "this ID does not exist in the database." Your controller turns that fact into an HTTP answer.
Common mistake: Checking if (!post) in one place but returning {} or [] in another. Be consistent: repositories return null for single-row lookups, empty arrays for list lookups, and callers pattern-match on those contracts.
🛠️ Your Mission
Open your existing app and find every place where your backend reads from or writes to the database. Your task:
-
Wrap all queries in repository functions. Create one file per domain concept (
userRepository.js,orderRepository.js, etc.). Move every inline query into the appropriate repository, with a parameterized placeholder for every piece of user-supplied data. -
Create a DTO mapper for every domain object you return to clients. Start with
users— that's where the leaks hurt most. Enumerate every field your client actually needs and build a function that returns exactly those fields. -
Verify your pool is a singleton. Trace the import path from any repository back to a single
new Pool(...)call at module load time. If you find more than one, consolidate them. -
Add not-found guards in every controller that fetches a single row. Return
404when the repository returnsnull.
✅ You're done when…
- Every SQL query uses placeholders (
$1,?) — no string concatenation or template literals anywhere in data-access code. Run a codebase search for backtick SQL strings as a final check (Production-Readiness Checklist: "parameterized queries enforced"). - Your controllers import only repository functions — no
pool,db, or ORM model imports appear in controller files. A grep forpool.queryin your controllers directory should return zero results. - Every endpoint that returns user data calls a DTO mapper before
res.json(). Confirm by checking thatpassword_hash,reset_token, and any internal flags do not appear in any API response body (Security Audit checklist: "no sensitive fields in API responses").
➡️ Next: Authentication at the API Layer. Build It Right, Or Don't Build It At All. 🏛️