Recurrent neural network unrolled through time: hidden-state memory and the vanishing gradient.
An RNN applies the same update hₜ = tanh(W·xₜ + U·hₜ₋₁ + b) at every timestep, accumulating memory in the hidden state. Backpropagation through time multiplies the per-step Jacobians ∂hₜ/∂hₜ₋₁ = diag(tanh′)·U across T steps; because tanh′ ≤ 1, when the spectral radius of U is below 1 these products shrink exponentially in T, so gradients from distant timesteps vanish and long-range dependencies are hard to learn. LSTMs address this with gated memory cells whose additive cell-state path can preserve gradients over hundreds of timesteps.
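A minimal sketch of the effect, assuming random weights and a contractive recurrent matrix (here U is scaled so its largest singular value is 0.9, which guarantees each Jacobian factor shrinks the gradient; the names `hidden`, `steps`, and `norms` are illustrative, not from any library):

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, steps = 16, 50

# Recurrent weights scaled so ||U||_2 = 0.9 < 1 (a contraction),
# input weights W, and bias b for h_t = tanh(W x_t + U h_{t-1} + b).
U = rng.normal(size=(hidden, hidden))
U *= 0.9 / np.linalg.norm(U, 2)
W = rng.normal(size=(hidden, 1)) * 0.1
b = np.zeros(hidden)

# Forward pass: record the pre-activation z_t at each timestep.
h = np.zeros(hidden)
pre_acts = []
for t in range(steps):
    x_t = rng.normal(size=1)
    z = W @ x_t + U @ h + b
    h = np.tanh(z)
    pre_acts.append(z)

# BPTT Jacobian: dh_t/dh_0 = prod over steps of diag(tanh'(z_t)) @ U.
# Each factor has spectral norm <= 0.9, so the product decays with t.
J = np.eye(hidden)
norms = []
for z in pre_acts:
    J = np.diag(1.0 - np.tanh(z) ** 2) @ U @ J
    norms.append(np.linalg.norm(J, 2))

print(f"||dh_1/dh_0|| = {norms[0]:.3e}")
print(f"||dh_T/dh_0|| = {norms[-1]:.3e}")
```

After 50 steps the Jacobian norm is orders of magnitude smaller than after one step: the gradient signal reaching early timesteps is effectively zero, which is exactly why plain RNNs struggle with long-range dependencies.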