AI Foundations

⏱ About 15 min · 15 XP

Layers: Stacking Neurons

One neuron computing a weighted sum is not very impressive on its own. It is basically a fancy decision formula. The real magic of neural networks comes from what happens when you take hundreds or thousands of neurons, arrange them in rows called layers, and connect each layer's outputs to the next layer's inputs. The network stops being a single formula and becomes a system that can discover its own intermediate representations of the world — patterns within patterns within patterns. Understanding layers is understanding why neural networks are so different from earlier computer programs.

The Three Kinds of Layers

A standard feedforward neural network has three kinds of layers, each with a specific role.

The input layer is where raw data enters the network. It is not really a layer of computing neurons: each node simply holds one feature of the input. If you are feeding the network a 28×28 grayscale image, the input layer has 784 nodes, one per pixel, each holding a number from 0 (black) to 1 (white). The input layer does no math; it is just the door.

The output layer is where the network produces its answer. For a network that classifies handwritten digits, the output layer has 10 neurons, one for each digit from 0 to 9. The neuron with the highest output value 'wins,' and that digit becomes the prediction. For a network that estimates a house price, the output layer might have just one neuron producing a single number.

The hidden layers are everything in between. They are called 'hidden' not because they are secret, but because their values appear in neither the input data nor the final output; you only see them if you look inside the network. Hidden layers do the real computational work: they transform the raw inputs into increasingly useful intermediate representations until the final layer can make a clean decision.
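To make the input and output layers concrete, here is a minimal Python sketch. The image and the ten output values are made up for illustration (they do not come from a trained network); the point is only to show how a 28×28 grid becomes 784 input numbers and how the 'winning' output neuron is picked.

```python
# Hypothetical 28x28 grayscale image: 28 rows of 28 brightness values in [0, 1].
image = [[0.0 for _ in range(28)] for _ in range(28)]
image[14][14] = 1.0  # a single white pixel near the middle

# Input layer: flatten the grid into one list, one number per pixel.
input_layer = [pixel for row in image for pixel in row]

# Made-up output-layer values for the 10 digit neurons (list index = digit).
output_layer = [0.01, 0.02, 0.05, 0.70, 0.03, 0.04, 0.05, 0.04, 0.03, 0.03]

# The prediction is the index of the neuron with the highest value.
predicted_digit = max(range(len(output_layer)), key=lambda i: output_layer[i])

print(len(input_layer))  # 784
print(predicted_digit)   # 3
```

Notice that the input layer involves no arithmetic at all; it only rearranges the pixels into a list.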

What Hidden Layers Actually Do

Each hidden layer learns to detect features in its input. Early hidden layers in an image network might detect edges and corners. Later layers detect shapes and textures. The final hidden layer might detect high-level concepts like 'snout' or 'fur pattern.' This hierarchy of features is what gives deep networks their power — they build understanding bottom-up.

Here is a concrete example. Imagine training a network to recognize whether a photo contains a cat.

- Input layer: 784 numbers, each representing a pixel brightness.
- Hidden layer 1: 128 neurons. After training, many of these neurons respond strongly to edges, places where pixel brightness changes sharply. They do not 'know' they are looking for edges; they learned this because it was useful.
- Hidden layer 2: 64 neurons. These combine the edge detectors from the previous layer to respond to curves, corners, and simple shapes, like an eye outline or a triangular ear.
- Hidden layer 3: 32 neurons. These combine shapes into higher-level patterns: something that looks like a face, something that looks like fur texture.
- Output layer: 2 neurons, one for 'cat' and one for 'not cat.' The network's prediction is whichever has the higher value.

No human programmed any of these intermediate features. The network discovered them because they were useful for getting from raw pixels to the right answer.
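The flow of data through the cat network can be sketched in plain Python. This is an illustration under several assumptions that the lesson does not specify: the weights are random rather than trained, each neuron adds a bias term, and the hidden layers use a ReLU activation (a common choice, used here only as an example). The helper names `make_dense_layer` and `dense_forward` are hypothetical. Because nothing is trained, the final prediction is meaningless; what matters is how the list of numbers changes size at each layer.

```python
import random

def make_dense_layer(n_inputs, n_neurons, rng):
    """Create small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases

def dense_forward(inputs, weights, biases, activation):
    """Each neuron takes a weighted sum of all inputs, adds its bias, then applies the activation."""
    return [activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
            for neuron_weights, b in zip(weights, biases)]

def relu(z):
    return max(0.0, z)  # assumed hidden-layer activation

def identity(z):
    return z  # leave the output layer's raw scores unchanged

rng = random.Random(0)  # fixed seed so the run is repeatable
layer_sizes = [784, 128, 64, 32, 2]  # input, three hidden layers, output
layers = [make_dense_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

# Stand-in for a real photo: 784 random pixel brightnesses in [0, 1).
activations = [rng.random() for _ in range(784)]
sizes_seen = [len(activations)]

for index, (weights, biases) in enumerate(layers):
    is_output = index == len(layers) - 1
    activation = identity if is_output else relu
    activations = dense_forward(activations, weights, biases, activation)
    sizes_seen.append(len(activations))

labels = ["cat", "not cat"]
prediction = labels[0] if activations[0] >= activations[1] else labels[1]

print(sizes_seen)  # [784, 128, 64, 32, 2]
print(prediction)  # arbitrary, since the weights are untrained
```

Training would adjust the numbers inside `weights` and `biases` so that the hidden layers end up responding to useful features; the structure of the loop stays the same.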

Connections Between Layers

In a fully connected layer, also called a dense layer, every neuron in one layer connects to every neuron in the next. If you have 128 neurons in one layer and 64 in the next, that is 128 × 64 = 8,192 connections, each with its own weight. Even a moderately sized network can have millions of weights, which is a large part of why training neural networks requires so much computation.

Not all networks are fully connected. Convolutional neural networks, used for images, use a sparser connection pattern in which each neuron connects only to a small nearby region of the previous layer, loosely inspired by receptive fields in the visual system. But the layered, feedforward structure, with data flowing from input to output through intermediate layers, is common to almost all standard neural networks.
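The connection count for dense layers is just a product of adjacent layer sizes, which a few lines of Python can check. The function name `count_connections` is hypothetical, and the sketch counts only weights on connections; it ignores bias terms, which real networks usually add per neuron.

```python
def count_connections(layer_sizes):
    """Return the number of weights between each pair of adjacent dense layers."""
    return [n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

# The two-layer case from the text: 128 neurons feeding 64 neurons.
print(count_connections([128, 64]))  # [8192]

# A fully connected network with layer sizes 784, 128, 64, 32, 2.
per_layer = count_connections([784, 128, 64, 32, 2])
print(per_layer)       # [100352, 8192, 2048, 64]
print(sum(per_layer))  # 110656
```

Most of the weights sit between the input and the first hidden layer, because that is where the largest layer (784 pixels) meets the next largest (128 neurons).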

Match each layer type to its role in the network.

Terms

Input layer
Hidden layer
Output layer
Dense layer
Feature

Definitions

Holds the raw numerical features of the data; does no computation
Produces the network's final answer or prediction
A layer where every neuron connects to every neuron in the next layer
A pattern or property detected by a neuron that helps with the final prediction
Transforms data into increasingly useful intermediate representations


More Layers, More Complexity — With a Cost

Adding hidden layers lets a network learn more complex patterns — but it also means more weights to train, more computation per forward pass, and more data needed to train well. Designing a network is partly a process of choosing how many layers and how many neurons per layer. Too few, and the network cannot capture the pattern. Too many, and it may memorize the training data instead of learning to generalize.

Why are the middle layers of a neural network called 'hidden' layers?

In an image recognition network, what kinds of features do early hidden layers tend to learn?

Design a Tiny Network on Paper

  1. You are building a neural network to predict whether a student will enjoy a book. You have four input features: genre match (does the genre fit their history?), page count, reading level match, and topic interest. Use a 0-to-1 scale for all four.
  2. On paper, draw a simple network:
     - Input layer: 4 nodes, labeled with the four features
     - Hidden layer 1: draw 4 neurons
     - Hidden layer 2: draw 3 neurons
     - Output layer: 2 neurons (Likely to enjoy / Unlikely to enjoy)
  3. Draw lines connecting every neuron in each layer to every neuron in the next layer.
  4. Count the total number of connections (weights) in your network.
  5. Which connections do you think should end up with large positive weights after training? Why? Write one sentence per hidden layer explaining what kind of 'feature' you imagine those neurons might learn to detect.