Backpropagation

Forward pass → loss → backward pass → gradient descent. Watch weights update in real time.


Backpropagation applies the chain rule of calculus to compute efficiently how much each weight contributed to the error. The gradient flows backward from the loss, through each layer, multiplying local derivatives — yielding one partial derivative per parameter, for millions of parameters, in a single backward sweep.
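The chain rule can be traced by hand on a tiny two-layer scalar network. This is a minimal sketch with made-up numbers (the network y = w2 · relu(w1 · x) and all values are illustrative, not taken from the demo above):

```python
# Chain rule on a two-layer scalar "network": y = w2 * relu(w1 * x),
# with squared-error loss L = (y - t)^2. Illustrative example only.
def relu(z):
    return max(z, 0.0)

def forward_backward(x, t, w1, w2):
    # Forward pass: save the intermediates the backward pass will need.
    h = w1 * x           # pre-activation
    a = relu(h)          # hidden activation
    y = w2 * a           # output
    L = (y - t) ** 2     # squared-error loss

    # Backward pass: multiply local derivatives along the chain.
    dL_dy = 2.0 * (y - t)                     # dL/dy
    dL_dw2 = dL_dy * a                        # dy/dw2 = a
    dL_da = dL_dy * w2                        # dy/da  = w2
    dL_dh = dL_da * (1.0 if h > 0 else 0.0)   # relu'(h)
    dL_dw1 = dL_dh * x                        # dh/dw1 = x
    return L, dL_dw1, dL_dw2

L, g1, g2 = forward_backward(x=2.0, t=1.0, w1=0.5, w2=0.3)
```

Each `dL_*` line is one application of the chain rule: the gradient arriving from the output times the node's local derivative. A finite-difference check on `w1` recovers the same `g1`, which is the standard way to verify a hand-written backward pass.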

Introduced by Rumelhart, Hinton & Williams (1986), the algorithm proceeds layer by layer: each node receives a gradient from its outputs, multiplies it by its local derivative, and passes the result upstream toward the inputs. Stochastic gradient descent on these gradients is the engine behind modern deep learning.
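The gradient-descent half of the loop can be sketched just as compactly. This hedged example fits a single weight to toy data with per-example (stochastic) updates; the dataset, learning rate, and epoch count are all assumptions for illustration:

```python
# Stochastic gradient descent on one weight: fit y = w * x to targets
# generated by t = 3 * x, using squared-error loss. Toy data only.
data = [(1.0, 3.0), (2.0, 6.0), (-1.0, -3.0)]
w, lr = 0.0, 0.05

for epoch in range(200):
    for x, t in data:                 # one SGD step per training example
        y = w * x                     # forward pass
        grad = 2.0 * (y - t) * x      # dL/dw for L = (y - t)^2
        w -= lr * grad                # step against the gradient
```

The weight converges to the generating value 3.0; each update is exactly the "forward pass → loss → backward pass → gradient descent" cycle the demo animates, collapsed to one parameter.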