Skip to main content
Backend & APIs
⚙️ Backend & APIsLesson 6 of 13

Talking to the Database (the Data-Access Layer)

The data-access layer: query safely, avoid leaks, and keep DB code out of your routes.

Talking to the Database (the Data-Access Layer)

Stage 3 · Backend & APIs · B.U.I.L.D. letter: L

How you read and write data is the single biggest factor separating a production app from a hackathon demo — get this wrong and you hand attackers a master key to every row you've ever stored.


⚠️ The vibe trap

When you ask an AI to "add a login endpoint," it will almost always concatenate user input directly into a SQL string and then return the entire database row — password hash, internal flags, and all — straight back to the caller. That's two catastrophic bugs in one endpoint: the first lets attackers dump your entire database with a single request; the second hands them the tools to crack every password you've ever stored. Neither shows up in a browser preview, which is exactly why this layer needs a human engineer watching it.


🔒 Parameterized Queries — The One Rule You Never Break

SQL injection is not a subtle exploit. If you build a query by gluing strings together, an attacker can terminate your query early and append anything they want — DROP TABLE users, SELECT * FROM users, or worse. Parameterized queries close this permanently: you write a query template with placeholders, and the database driver treats the values as data, never as SQL syntax.

Wrong (never do this):

// DO NOT WRITE THIS
const row = await db.query(
  `SELECT * FROM users WHERE email = '${req.body.email}'`
);

If req.body.email is ' OR '1'='1, that query returns every user in your table.

Right (always do this):

// node-postgres (pg) — positional placeholders
const { rows } = await db.query(
  'SELECT id, email, display_name FROM users WHERE email = $1',
  [req.body.email]
);

// mysql2 — question-mark placeholders
const [rows] = await db.execute(
  'SELECT id, email, display_name FROM users WHERE email = ?',
  [req.body.email]
);

The driver serializes the value safely regardless of what it contains. Your query template never changes shape.

Mental model: Your SQL is a stencil. Parameterized values fill holes in the stencil — they can never rewrite the stencil.

Why it matters: OWASP has listed SQL injection as a top-ten vulnerability for over a decade. Automated scanners find it in minutes.

Common mistake: Using template literals because they "look cleaner." They look clean right up until someone types '; DROP TABLE orders; -- into your search box.


🏛️ The Repository Pattern — Keep SQL Out of Your Controllers

A controller's job is to receive a request, call business logic, and send a response. It should not know whether your data lives in Postgres, MongoDB, SQLite, or a flat file. When SQL leaks into controllers, you end up copy-pasting the same query in four places, and every schema change breaks everything.

The fix is a repository (also called a data-access object or DAO): a plain module that owns all queries for one domain concept. Your controller imports the repository and calls named functions. The repository is the only code that ever touches the database.

// src/repositories/userRepository.js

import { pool } from '../db/pool.js';

export async function findUserByEmail(email) {
  const { rows } = await pool.query(
    'SELECT id, email, display_name, password_hash FROM users WHERE email = $1',
    [email]
  );
  return rows[0] ?? null; // null, not undefined — explicit not-found signal
}

export async function findUserById(id) {
  const { rows } = await pool.query(
    'SELECT id, email, display_name, created_at FROM users WHERE id = $1',
    [id]
  );
  return rows[0] ?? null;
}

export async function createUser({ email, displayName, passwordHash }) {
  const { rows } = await pool.query(
    `INSERT INTO users (email, display_name, password_hash)
     VALUES ($1, $2, $3)
     RETURNING id, email, display_name, created_at`,
    [email, displayName, passwordHash]
  );
  return rows[0];
}

Now your controller looks like this:

// src/controllers/authController.js

import { findUserByEmail } from '../repositories/userRepository.js';
import { toPublicUser } from '../dto/userDto.js';

export async function getProfile(req, res) {
  const user = await findUserByEmail(req.params.email);
  if (!user) return res.status(404).json({ error: 'User not found' });
  res.json(toPublicUser(user));
}

No SQL. No database import. No raw rows.

Mental model: The repository is the only thing allowed to know how data is stored. Everything above it thinks in business objects, not tables and columns.

Why it matters: When you migrate from Postgres to Supabase, or add a caching layer, or rename a column, you change one file — not thirty controllers.

Common mistake: Writing "just this one query" inline in a controller because it's faster. Six weeks later you have twelve inline queries and no single place to enforce field whitelisting.


🎭 Row-to-DTO Mapping — Never Return Raw Database Rows

A raw database row contains everything the database knows about a record: password_hash, reset_token, internal_status_flag, stripe_customer_id, last_ip_address. None of that should ever reach a client. Not accidentally, not "just this once."

A DTO (Data Transfer Object) is a plain function that takes a raw row and returns only the fields a client is allowed to see. It also lets you rename database snake_case to camelCase and reshape nested data.

// src/dto/userDto.js

/**
 * Strips sensitive fields and normalizes naming for client responses.
 * NEVER return a raw user row from any endpoint.
 */
export function toPublicUser(row) {
  if (!row) return null;
  return {
    id: row.id,
    email: row.email,
    displayName: row.display_name,  // snake_case → camelCase
    createdAt: row.created_at,
    // password_hash: intentionally omitted
    // reset_token: intentionally omitted
    // internal_flags: intentionally omitted
  };
}

/**
 * Even slimmer shape for lists — only what a list item needs.
 */
export function toUserSummary(row) {
  return {
    id: row.id,
    displayName: row.display_name,
  };
}

Use it wherever you return user data:

// In a controller — always map before responding
const user = await findUserById(req.user.id);
if (!user) return res.status(404).json({ error: 'Not found' });
res.json(toPublicUser(user));

// In a list endpoint — map the whole array
const users = await listActiveUsers();
res.json(users.map(toUserSummary));

Mental model: The DTO is a border control checkpoint. Raw rows stay inside the server. Shaped objects cross to the client.

Why it matters: A single res.json(user) on a raw row has leaked real users' password hashes in production incidents at companies large enough to know better. One mapper function eliminates the entire class of risk.

Common mistake: Deleting fields by mutation (delete user.passwordHash) instead of building a new object. Mutation is fragile and easy to forget; explicit construction is explicit about what is shared.


🌊 Connection Pooling — Don't Open a New Connection Per Request

Opening a database connection is expensive: TCP handshake, authentication, SSL negotiation. On a busy API, doing that for every request will exhaust your database's connection limit within seconds. Connection pooling keeps a set of open connections ready and lends them out.

You configure the pool once at startup and import it everywhere:

// src/db/pool.js  — create this file once, import it everywhere

import pg from 'pg';

const { Pool } = pg;

export const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 10,              // maximum open connections
  idleTimeoutMillis: 30_000,   // close idle connections after 30 s
  connectionTimeoutMillis: 2_000, // fail fast if pool is exhausted
});

// Optional: log pool errors so they appear in your monitoring
pool.on('error', (err) => {
  console.error('Unexpected pool error', err);
});

Then every repository file imports from here — never calls new Pool() or new Client() again:

// src/repositories/orderRepository.js

import { pool } from '../db/pool.js';

export async function findOrdersByUserId(userId) {
  const { rows } = await pool.query(
    `SELECT id, total_cents, status, created_at
     FROM orders
     WHERE user_id = $1
     ORDER BY created_at DESC`,
    [userId]
  );
  return rows; // still raw rows — map them in the controller or service
}

Mental model: The pool is a shared fleet of cars. Requests borrow one, drive it, return it. Nobody buys a new car for every trip.

Why it matters: Without pooling, each request opens and closes a connection. Under load — 50 concurrent requests on a Postgres instance with a limit of 100 — you'll hit too many connections errors within minutes. With a pool of 10, those same 50 requests queue and succeed.

Common mistake: Creating the pool inside a request handler or repository function. The pool must be a module-level singleton. Creating it inside a function means a new pool (and new connections) on every call.


🚨 Handling Not-Found Gracefully

When a repository returns null, the right response is 404 Not Found — not a crash, not a 500, and not an empty 200. Establish this pattern consistently so callers always know what to expect.

// src/repositories/postRepository.js

export async function findPostById(id) {
  const { rows } = await pool.query(
    'SELECT id, title, body, author_id, published_at FROM posts WHERE id = $1',
    [id]
  );
  return rows[0] ?? null; // explicit null — not undefined, not an empty array
}
// src/controllers/postController.js

import { findPostById } from '../repositories/postRepository.js';
import { toPublicPost } from '../dto/postDto.js';

export async function getPost(req, res) {
  const post = await findPostById(req.params.id);

  if (post === null) {
    return res.status(404).json({ error: 'Post not found' });
  }

  res.json(toPublicPost(post));
}

Keep the not-found check before any further logic. If you pass a null into your DTO mapper or business logic, you'll get a confusing TypeError: Cannot read properties of null in your logs instead of a clean 404 in your tests.

Mental model: null from the repository means "this ID does not exist in the database." Your controller turns that fact into an HTTP answer.

Common mistake: Checking if (!post) in one place but returning {} or [] in another. Be consistent: repositories return null for single-row lookups, empty arrays for list lookups, and callers pattern-match on those contracts.


🛠️ Your Mission

Open your existing app and find every place where your backend reads from or writes to the database. Your task:

  1. Wrap all queries in repository functions. Create one file per domain concept (userRepository.js, orderRepository.js, etc.). Move every inline query into the appropriate repository, with a parameterized placeholder for every piece of user-supplied data.

  2. Create a DTO mapper for every domain object you return to clients. Start with users — that's where the leaks hurt most. Enumerate every field your client actually needs and build a function that returns exactly those fields.

  3. Verify your pool is a singleton. Trace the import path from any repository back to a single new Pool(...) call at module load time. If you find more than one, consolidate them.

  4. Add not-found guards in every controller that fetches a single row. Return 404 when the repository returns null.


✅ You're done when…

  • Every SQL query uses placeholders ($1, ?) — no string concatenation or template literals anywhere in data-access code. Run a codebase search for backtick SQL strings as a final check (Production-Readiness Checklist: "parameterized queries enforced").
  • Your controllers import only repository functions — no pool, db, or ORM model imports appear in controller files. A grep for pool.query in your controllers directory should return zero results.
  • Every endpoint that returns user data calls a DTO mapper before res.json(). Confirm by checking that password_hash, reset_token, and any internal flags do not appear in any API response body (Security Audit checklist: "no sensitive fields in API responses").

➡️ Next: Authentication at the API Layer. Build It Right, Or Don't Build It At All. 🏛️

Always-on rigor toolkit

🏛️ Build It Right, Or Don't Build It At All.