The Forward Pass, Formally

You now know what a single neuron computes. A full neural network is thousands of neurons organized into layers, each layer passing its outputs as inputs to the next. The forward pass is the complete computation from network input to network output — a structured cascade of matrix multiplications and activation functions. In this lesson we trace a full numerical example through a small network and write down the general matrix equations.

Network Architecture and Notation

Define a fully connected (dense) network with:
- Input layer: 3 features (x is a 3-dimensional vector)
- Hidden layer 1: 2 neurons
- Hidden layer 2: 2 neurons
- Output layer: 1 neuron (binary classification)

For layer l, define:
W^(l) = weight matrix (rows = neurons in layer l, columns = neurons in layer l-1)
b^(l) = bias vector (one entry per neuron in layer l)
h^(l) = activation vector (output of layer l)
h^(0) = x = input vector

The forward pass for layer l:
z^(l) = W^(l) h^(l-1) + b^(l)
h^(l) = sigma(z^(l)) [sigma is the layer's activation function, applied element-wise]

Note: W^(1) has shape 2x3 (2 hidden neurons, 3 inputs). W^(2) has shape 2x2. W^(3) has shape 1x2.
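For this particular architecture, the layer equation unrolls into three steps. The block below restates the definitions above with the shapes attached, so you can see how each output feeds the next layer. It adds nothing beyond the notation already introduced.

```latex
\begin{aligned}
h^{(0)} &= x, & x &\in \mathbb{R}^{3} \\
z^{(1)} &= W^{(1)} h^{(0)} + b^{(1)}, \quad h^{(1)} = \sigma(z^{(1)}), & W^{(1)} &\in \mathbb{R}^{2 \times 3},\ b^{(1)} \in \mathbb{R}^{2} \\
z^{(2)} &= W^{(2)} h^{(1)} + b^{(2)}, \quad h^{(2)} = \sigma(z^{(2)}), & W^{(2)} &\in \mathbb{R}^{2 \times 2},\ b^{(2)} \in \mathbb{R}^{2} \\
z^{(3)} &= W^{(3)} h^{(2)} + b^{(3)}, \quad h^{(3)} = \sigma(z^{(3)}), & W^{(3)} &\in \mathbb{R}^{1 \times 2},\ b^{(3)} \in \mathbb{R}^{1}
\end{aligned}
```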

Matrix View

Writing one layer as z = W h^(prev) + b computes all neurons in that layer simultaneously as a single matrix-vector product. This is why neural networks can be implemented efficiently on GPUs — the entire forward pass is a sequence of dense matrix operations, which modern hardware is optimized to perform in parallel.
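To see why one matrix-vector product covers every neuron at once, it can help to expand layer 1 of the network above entry by entry. The entry names w_ij, x_j, and b_i are introduced here only for illustration. Each row of the result is exactly the weighted sum a single neuron computes.

```latex
z^{(1)} =
\begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
+
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
=
\begin{bmatrix}
w_{11} x_1 + w_{12} x_2 + w_{13} x_3 + b_1 \\
w_{21} x_1 + w_{22} x_2 + w_{23} x_3 + b_2
\end{bmatrix}
```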

Concrete example. Set x = [1.0, 0.5, -1.0]^T.

Layer 1 weights and biases:
W^(1) = [[0.2, 0.8, -0.5], [0.6, -0.3, 1.0]]
b^(1) = [0.1, -0.2]^T

Compute z^(1):
z^(1)[1] = (0.2)(1.0) + (0.8)(0.5) + (-0.5)(-1.0) + 0.1 = 0.2 + 0.4 + 0.5 + 0.1 = 1.2
z^(1)[2] = (0.6)(1.0) + (-0.3)(0.5) + (1.0)(-1.0) + (-0.2) = 0.6 - 0.15 - 1.0 - 0.2 = -0.75

Apply ReLU: h^(1) = [max(0, 1.2), max(0, -0.75)] = [1.2, 0.0]

Layer 2 weights and biases:
W^(2) = [[0.5, -1.0], [-0.3, 0.7]]
b^(2) = [0.0, 0.5]^T

Compute z^(2):
z^(2)[1] = (0.5)(1.2) + (-1.0)(0.0) + 0.0 = 0.6
z^(2)[2] = (-0.3)(1.2) + (0.7)(0.0) + 0.5 = -0.36 + 0.5 = 0.14

Apply ReLU: h^(2) = [0.6, 0.14]

Output layer weights and biases:
W^(3) = [[1.2, -0.8]]
b^(3) = [0.1]

Compute z^(3):
z^(3) = (1.2)(0.6) + (-0.8)(0.14) + 0.1 = 0.72 - 0.112 + 0.1 = 0.708

Apply sigmoid: h^(3) = 1/(1+e^(-0.708)) ≈ 1/(1+0.493) ≈ 0.670

The network predicts probability 0.670 that this example belongs to the positive class.
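The hand computation above can be checked with a short script. This is a minimal sketch in plain Python, using lists rather than a matrix library. The helper names (relu, sigmoid, affine, forward) are choices made for this illustration, and the weights, biases, and input are copied from the worked example.

```python
import math


def relu(z):
    """ReLU applied element-wise to a list of numbers."""
    return [max(0.0, v) for v in z]


def sigmoid(z):
    """Logistic sigmoid applied element-wise to a list of numbers."""
    return [1.0 / (1.0 + math.exp(-v)) for v in z]


def affine(W, h_prev, b):
    """Pre-activation z = W h_prev + b, with W given as a list of rows."""
    return [sum(w * h for w, h in zip(row, h_prev)) + bias
            for row, bias in zip(W, b)]


def forward(x, layers):
    """Run the forward pass and return every activation [h0, h1, ..., hL]."""
    activations = [x]
    for W, b, activation in layers:
        z = affine(W, activations[-1], b)
        activations.append(activation(z))
    return activations


# Input, weights, and biases copied from the worked example.
x = [1.0, 0.5, -1.0]
layers = [
    ([[0.2, 0.8, -0.5], [0.6, -0.3, 1.0]], [0.1, -0.2], relu),
    ([[0.5, -1.0], [-0.3, 0.7]], [0.0, 0.5], relu),
    ([[1.2, -0.8]], [0.1], sigmoid),
]

h0, h1, h2, h3 = forward(x, layers)
print("h1 =", [round(v, 3) for v in h1])
print("h2 =", [round(v, 3) for v in h2])
print("output probability =", round(h3[0], 3))
```

The printed values should agree with the hand-computed h^(1), h^(2), and output up to rounding. Changing one weight and re-running is a quick way to see how a single parameter moves the prediction.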

Parameters and Computation

How many total parameters does our small network have?

Layer 1: W^(1) is 2x3 = 6 weights; b^(1) is 2 values. Total: 8.
Layer 2: W^(2) is 2x2 = 4 weights; b^(2) is 2 values. Total: 6.
Layer 3: W^(3) is 1x2 = 2 weights; b^(3) is 1 value. Total: 3.
Grand total: 17 parameters.

A real network like GPT-2 small has roughly 124 million parameters (originally reported as 117 million). A modern large language model may have hundreds of billions. Such models add other components, but their core computation is still the pattern shown here: W^(l) h^(l-1) + b^(l) followed by a nonlinearity, repeated with different layer sizes. The conceptual machinery you learned here scales directly.
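The same count can be automated for any fully connected network. The sketch below assumes the network is described only by its layer widths, input first. The function name count_parameters is a hypothetical helper for this illustration.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the width of every layer, input first,
    for example [3, 2, 2, 1] for the network in this lesson.
    """
    total = 0
    for m, n in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n * m  # W^(l) has n rows and m columns
        total += n      # b^(l) has one entry per neuron
    return total


print(count_parameters([3, 2, 2, 1]))  # 17, matching the hand count
```

Because each layer contributes n*m + n parameters, widening a layer increases the count of both its own weight matrix and the next layer's.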

Match each symbol to what it represents in a layer's forward pass.

Terms

W^(l)

b^(l)

z^(l)

h^(l)

h^(0)

Definitions

Activation vector; element-wise application of sigma to z^(l)

The raw input to the network; the first layer's incoming signal

Bias vector for layer l; one scalar per neuron

Pre-activation vector; result of W^(l) h^(l-1) + b^(l) before nonlinearity

Weight matrix for layer l; shape is (neurons in l) x (neurons in l-1)


Check Your Dimensions

Dimension errors are the most common bug when implementing a forward pass. Before writing any code, work out the shapes: if layer l has n neurons and layer l-1 has m neurons, then W^(l) is (n x m), h^(l-1) is (m x 1), z^(l) = W^(l) h^(l-1) + b^(l) is (n x 1). Track shapes at every step.
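One way to apply this advice in code is to validate shapes before running any arithmetic. The sketch below assumes weights are stored as lists of rows and biases as flat lists. The function name check_shapes is a hypothetical helper, not part of any library.

```python
def check_shapes(layer_sizes, weights, biases):
    """Raise ValueError if any W^(l) or b^(l) has an unexpected shape.

    layer_sizes lists layer widths, input first.
    weights[l-1] holds W^(l) as a list of rows; biases[l-1] holds b^(l).
    """
    pairs = zip(layer_sizes[:-1], layer_sizes[1:])
    for l, (m, n) in enumerate(pairs, start=1):
        W, b = weights[l - 1], biases[l - 1]
        # W^(l) must have n rows (neurons in l) and m columns (neurons in l-1).
        if len(W) != n or any(len(row) != m for row in W):
            raise ValueError(f"W^({l}) should have shape {n} x {m}")
        # b^(l) must have one entry per neuron in layer l.
        if len(b) != n:
            raise ValueError(f"b^({l}) should have {n} entries")
    return True


# Shapes of the 3-2-2-1 network used in this lesson.
weights = [
    [[0.2, 0.8, -0.5], [0.6, -0.3, 1.0]],
    [[0.5, -1.0], [-0.3, 0.7]],
    [[1.2, -0.8]],
]
biases = [[0.1, -0.2], [0.0, 0.5], [0.1]]
print(check_shapes([3, 2, 2, 1], weights, biases))  # True
```

Passing a transposed W^(1) (3 rows of 2 entries) raises a ValueError naming the offending layer, which is usually easier to diagnose than a wrong number at the output.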

In the worked example, h^(1)[2] = 0.0 after ReLU. What caused this and what does it imply for layer 2's computation?

If you doubled the number of neurons in hidden layer 1 from 2 to 4, what would the new shapes of W^(1) and W^(2) be?

Trace Your Own Forward Pass

Step 1: Design a tiny network: 2 inputs, 3 neurons in one hidden layer, 1 output neuron.
Step 2: Choose small weight values and biases (use numbers between -1 and 1 for ease).
Step 3: Choose an input vector x = [x1, x2].
Step 4: Compute z^(1) (a 3-dimensional vector) by hand, showing each dot product.
Step 5: Apply ReLU element-wise to get h^(1).
Step 6: Compute z^(2) and apply sigmoid to get the output probability.
Step 7: Interpret your result: what probability does the network assign to this input? What would make it predict 'yes' (probability > 0.5)?