Stochastic Gradient Descent

Visualize how SGD, Momentum, RMSProp, and Adam navigate a 2D loss landscape. Click anywhere to set the starting position. Observe how adaptive methods escape saddle points and navigate ravines.
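The four update rules being visualized can be sketched in a few lines. Below is a minimal, self-contained Python sketch, assuming an illustrative ravine-shaped loss L(x, y) = x² + 10y² (the demo's actual landscape is not specified) and the panel's defaults η = 0.01 and β = 0.9; the second-moment decay β₂ = 0.999 and ε = 1e-8 are common conventions, not values from the demo.

```python
import math

def grad(x, y):
    # Gradient of the illustrative loss L(x, y) = x^2 + 10*y^2,
    # whose minimum is at the origin; the 10x curvature gap in y
    # creates the "ravine" that plain SGD struggles with.
    return 2 * x, 20 * y

def run(method, steps=200, lr=0.01, beta=0.9, beta2=0.999, eps=1e-8):
    x, y = 1.0, 1.0     # starting position (click-to-set in the demo)
    vx = vy = 0.0       # momentum / first-moment accumulators
    sx = sy = 0.0       # squared-gradient / second-moment accumulators
    for t in range(1, steps + 1):
        gx, gy = grad(x, y)
        if method == "sgd":
            # Plain gradient step
            x -= lr * gx
            y -= lr * gy
        elif method == "momentum":
            # Heavy-ball: accumulate a velocity, then step along it
            vx = beta * vx + gx
            vy = beta * vy + gy
            x -= lr * vx
            y -= lr * vy
        elif method == "rmsprop":
            # Scale each coordinate by a running RMS of its gradients
            sx = beta * sx + (1 - beta) * gx * gx
            sy = beta * sy + (1 - beta) * gy * gy
            x -= lr * gx / (math.sqrt(sx) + eps)
            y -= lr * gy / (math.sqrt(sy) + eps)
        elif method == "adam":
            # Bias-corrected first and second moments
            vx = beta * vx + (1 - beta) * gx
            vy = beta * vy + (1 - beta) * gy
            sx = beta2 * sx + (1 - beta2) * gx * gx
            sy = beta2 * sy + (1 - beta2) * gy * gy
            vhx, vhy = vx / (1 - beta ** t), vy / (1 - beta ** t)
            shx, shy = sx / (1 - beta2 ** t), sy / (1 - beta2 ** t)
            x -= lr * vhx / (math.sqrt(shx) + eps)
            y -= lr * vhy / (math.sqrt(shy) + eps)
    return x * x + 10 * y * y  # final loss

for m in ("sgd", "momentum", "rmsprop", "adam"):
    print(m, run(m))
```

Running all four from the same start shows the qualitative behavior the demo animates: the adaptive methods (RMSProp, Adam) take similarly sized steps in both coordinates despite the 10x curvature mismatch, while plain SGD progresses slowly along the shallow axis.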

Optimizers: SGD, Momentum, RMSProp, Adam

Hyperparameters

Learning rate η = 0.01
Momentum β = 0.9
Noise σ = 0.1
Speed = 3
