Recurrent neural network unrolled through time: hidden-state memory and the vanishing gradient.
An RNN applies the same update hₜ = tanh(W·xₜ + U·hₜ₋₁ + b) at every timestep, accumulating memory in the hidden state. Backpropagation through time multiplies the per-step Jacobians ∂hₜ/∂hₜ₋₁ = diag(tanh′)·U across T steps; because tanh′ ≤ 1, when the spectral radius of U is below 1 these products shrink exponentially in T, so gradients from distant timesteps vanish and long-range dependencies are hard to learn. LSTMs address this with gated memory cells whose additive cell-state path can preserve gradients over hundreds of timesteps.
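A minimal sketch of the effect, assuming random weights and a contractive recurrent matrix (here U is scaled so its largest singular value is 0.9, which guarantees each Jacobian factor shrinks the gradient; the names `hidden`, `steps`, and `norms` are illustrative, not from any library):

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, steps = 16, 50

# Recurrent weights scaled so ||U||_2 = 0.9 < 1 (a contraction),
# input weights W, and bias b for h_t = tanh(W x_t + U h_{t-1} + b).
U = rng.normal(size=(hidden, hidden))
U *= 0.9 / np.linalg.norm(U, 2)
W = rng.normal(size=(hidden, 1)) * 0.1
b = np.zeros(hidden)

# Forward pass: record the pre-activation z_t at each timestep.
h = np.zeros(hidden)
pre_acts = []
for t in range(steps):
    x_t = rng.normal(size=1)
    z = W @ x_t + U @ h + b
    h = np.tanh(z)
    pre_acts.append(z)

# BPTT Jacobian: dh_t/dh_0 = prod over steps of diag(tanh'(z_t)) @ U.
# Each factor has spectral norm <= 0.9, so the product decays with t.
J = np.eye(hidden)
norms = []
for z in pre_acts:
    J = np.diag(1.0 - np.tanh(z) ** 2) @ U @ J
    norms.append(np.linalg.norm(J, 2))

print(f"||dh_1/dh_0|| = {norms[0]:.3e}")
print(f"||dh_T/dh_0|| = {norms[-1]:.3e}")
```

After 50 steps the Jacobian norm is orders of magnitude smaller than after one step: the gradient signal reaching early timesteps is effectively zero, which is exactly why plain RNNs struggle with long-range dependencies.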