About: The loss landscape is a mixture of Gaussian "valleys" in a 2D parameter space. Three optimizers are compared: vanilla SGD, whose gradient carries added random noise to mimic stochastic mini-batches; Momentum, which accumulates a velocity that can carry it out of shallow local minima; and Adam, which combines momentum with adaptive per-parameter learning rates. The lower panel shows loss vs. training step.
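The comparison described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the demo's actual code: the valley centers, depths, and widths in `VALLEYS`, the learning rates, the noise scale, and all function names are assumptions chosen for the example. The momentum update uses the common `v = beta * v + g` form, which may differ from the demo's.

```python
import math
import random

# Hypothetical valleys: (center_x, center_y, depth, width). Illustrative values only.
VALLEYS = [(-1.0, -1.0, 1.0, 0.6), (1.5, 0.5, 0.5, 0.4)]


def loss(p):
    """Negated sum of Gaussians: each valley lowers the loss near its center."""
    x, y = p
    return -sum(d * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * w * w))
                for cx, cy, d, w in VALLEYS)


def grad(p):
    """Analytic gradient of loss() with respect to (x, y)."""
    x, y = p
    gx = gy = 0.0
    for cx, cy, d, w in VALLEYS:
        g = d * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * w * w)) / (w * w)
        gx += g * (x - cx)
        gy += g * (y - cy)
    return [gx, gy]


def sgd_step(p, state, lr=0.1, noise=0.05):
    # Gaussian noise on the gradient stands in for mini-batch sampling (assumed scale).
    g = [gi + random.gauss(0.0, noise) for gi in grad(p)]
    return [pi - lr * gi for pi, gi in zip(p, g)]


def momentum_step(p, state, lr=0.1, beta=0.9):
    # Velocity is a decaying sum of past gradients.
    v = state.setdefault("v", [0.0, 0.0])
    v[:] = [beta * vi + gi for vi, gi in zip(v, grad(p))]
    return [pi - lr * vi for pi, vi in zip(p, v)]


def adam_step(p, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    g = grad(p)
    t = state["t"] = state.get("t", 0) + 1
    m = state.setdefault("m", [0.0, 0.0])
    v = state.setdefault("v", [0.0, 0.0])
    m[:] = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, g)]       # first moment
    v[:] = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(v, g)]  # second moment
    m_hat = [mi / (1 - b1 ** t) for mi in m]                       # bias correction
    v_hat = [vi / (1 - b2 ** t) for vi in v]
    return [pi - lr * mh / (math.sqrt(vh) + eps)
            for pi, mh, vh in zip(p, m_hat, v_hat)]


def run(step_fn, start, steps=200):
    """Return the final position and the loss recorded at every step."""
    p, state, history = list(start), {}, []
    for _ in range(steps):
        history.append(loss(p))
        p = step_fn(p, state)
    history.append(loss(p))
    return p, history


if __name__ == "__main__":
    random.seed(0)
    for name, fn in [("SGD", sgd_step), ("Momentum", momentum_step), ("Adam", adam_step)]:
        final_p, hist = run(fn, [-0.3, -0.5])
        print(name, "final position:", final_p, "final loss:", hist[-1])
```

Each `history` list corresponds to one curve in a loss-vs-step plot like the lower panel. All three optimizers share the `(p, state)` step signature so that `run` can drive them interchangeably.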