LOSS LANDSCAPE

Optimizers: SGD, Momentum, Adam
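SGD: θ ← θ − η∇L
Momentum: v ← βv + ∇L, θ ← θ − ηv
Adam: m ← β₁m + (1−β₁)g, v ← β₂v + (1−β₂)g², θ ← θ − η·m̂/√v̂
where g = ∇L, η is the learning rate, g² is taken element-wise, and m̂ = m/(1−β₁ᵗ), v̂ = v/(1−β₂ᵗ) are the bias-corrected moments at step t. Implementations usually add a small ε to the denominator for numerical stability.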
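The three update rules above can be sketched in plain Python as follows. This is a minimal illustration, not the code behind this page: the quadratic test surface, the start point, the function names, and the default hyperparameters are all assumptions chosen for the example, and the `eps` term in the Adam denominator is the usual stability constant rather than something stated in the formulas.

```python
import math

def loss(theta):
    # Hypothetical test surface (an assumption, not the page's landscape):
    # an elongated quadratic bowl with its minimum at the origin.
    x, y = theta
    return 0.5 * (x * x + 10.0 * y * y)

def grad(theta):
    # Analytic gradient of the quadratic bowl above.
    x, y = theta
    return (x, 10.0 * y)

def sgd_step(theta, lr):
    # theta <- theta - lr * grad
    g = grad(theta)
    return tuple(t - lr * gi for t, gi in zip(theta, g))

def momentum_step(theta, v, lr, beta):
    # v <- beta * v + grad, then theta <- theta - lr * v
    g = grad(theta)
    v = tuple(beta * vi + gi for vi, gi in zip(v, g))
    theta = tuple(t - lr * vi for t, vi in zip(theta, v))
    return theta, v

def adam_step(theta, m, v, t, lr, beta1, beta2, eps=1e-8):
    # First and second moment estimates, bias-corrected at step t (t starts at 1).
    g = grad(theta)
    m = tuple(beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g))
    v = tuple(beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g))
    m_hat = tuple(mi / (1 - beta1 ** t) for mi in m)
    v_hat = tuple(vi / (1 - beta2 ** t) for vi in v)
    theta = tuple(
        th - lr * mh / (math.sqrt(vh) + eps)
        for th, mh, vh in zip(theta, m_hat, v_hat)
    )
    return theta, m, v

def run(steps=200, lr=0.05, beta=0.9, beta1=0.9, beta2=0.999):
    # Run all three optimizers from the same (assumed) start point
    # and return the final loss reached by each.
    start = (2.0, 1.0)
    th_sgd = th_mom = th_adam = start
    v_mom = (0.0, 0.0)
    m = v = (0.0, 0.0)
    for t in range(1, steps + 1):
        th_sgd = sgd_step(th_sgd, lr)
        th_mom, v_mom = momentum_step(th_mom, v_mom, lr, beta)
        th_adam, m, v = adam_step(th_adam, m, v, t, lr, beta1, beta2)
    return {"sgd": loss(th_sgd), "momentum": loss(th_mom), "adam": loss(th_adam)}

if __name__ == "__main__":
    for name, final_loss in run().items():
        print(f"{name}: {final_loss:.3e}")
```

Note that the momentum rule here accumulates the raw gradient (v ← βv + ∇L), so its effective step size grows toward η/(1−β); a rule that instead scales the gradient by (1−β) would behave differently at the same η.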
Color map: blue=low loss → red=high loss (log scale)
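A log-scale blue-to-red mapping like the one described in the legend could be computed roughly as below. This is only a sketch under assumptions: the function name `loss_to_rgb`, the linear blue/red blend, and the clamping floor `eps` are hypothetical, and the actual palette used by the visualization is not specified here.

```python
import math

def loss_to_rgb(loss, lo, hi, eps=1e-12):
    # Map a loss value to an (R, G, B) tuple on a log scale:
    # loss == lo gives pure blue, loss == hi gives pure red.
    # eps guards against log(0); it is an assumed floor.
    a = math.log(max(lo, eps))
    b = math.log(max(hi, eps))
    x = math.log(max(loss, eps))
    t = 0.0 if b == a else (x - a) / (b - a)
    t = min(1.0, max(0.0, t))  # clamp values outside [lo, hi]
    return (round(255 * t), 0, round(255 * (1 - t)))
```

On a log scale, the midpoint color corresponds to the geometric mean of `lo` and `hi` rather than their arithmetic mean, which keeps detail visible near the minimum where loss values span several orders of magnitude.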