SGD — 2D Loss Landscape Exploration

Stochastic gradient descent navigating a parameterized loss surface
θ_{t+1} = θ_t − η·∇L̃(θ_t)  |  L̃ = noisy minibatch estimate  |  Escape rate ∝ e^{−ΔL/η}
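The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the demo's own implementation: the double-well surface, the Gaussian gradient noise standing in for minibatch noise, and the names loss, grad, sgd, eta and sigma are all assumptions made for the example. On this assumed surface the barrier between the two minima at (±1, 0) is ΔL = 1.

```python
import math
import random


def loss(x, y):
    # Assumed toy surface: double well in x, quadratic bowl in y.
    # Minima at (-1, 0) and (1, 0) with L = 0; saddle at (0, 0) with L = 1.
    return (x * x - 1.0) ** 2 + 0.5 * y * y


def grad(x, y):
    # Exact gradient of the toy surface above.
    return 4.0 * x * (x * x - 1.0), y


def sgd(start, eta=0.05, sigma=0.1, steps=1000, seed=0):
    """Run theta_{t+1} = theta_t - eta * noisy_grad(theta_t).

    Minibatch noise is modelled (as an assumption) by adding independent
    Gaussian noise of standard deviation sigma to each gradient component.
    Returns the list of visited positions, including the start point.
    """
    rng = random.Random(seed)
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        gx, gy = grad(x, y)
        gx += rng.gauss(0.0, sigma)  # noisy estimate of dL/dx
        gy += rng.gauss(0.0, sigma)  # noisy estimate of dL/dy
        x -= eta * gx
        y -= eta * gy
        path.append((x, y))
    return path


if __name__ == "__main__":
    path = sgd((0.5, 1.0))
    x, y = path[-1]
    gx, gy = grad(x, y)
    print(f"step={len(path) - 1} loss={loss(x, y):.4f} "
          f"||grad||={math.hypot(gx, gy):.4f} pos=({x:.3f}, {y:.3f})")
```

With sigma set to 0 the run reduces to plain gradient descent and settles into the nearest well. Raising eta or sigma makes hops over the barrier more likely, which is the qualitative behaviour the escape-rate expression describes.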
Click the landscape to set the start point.
Readouts: loss, step, ||grad||, position (x, y)