Visualize constrained optimization: maximize or minimize f(x,y) subject to a constraint g(x,y)=c. At an optimum, a level curve of f touches the constraint curve tangentially. Projected gradient descent walks along the constraint: take a gradient step, then project back onto the feasible set, repeating until the tangency point is reached.
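A minimal sketch of the projected-gradient idea, using an assumed toy problem (maximize f(x,y) = x + y on the unit circle x² + y² = 1, whose optimum is x = y = 1/√2): step along ∇f, then project back onto the circle by renormalizing.

```python
import math

def grad_f(x, y):
    # gradient of f(x, y) = x + y
    return (1.0, 1.0)

# Projected gradient ascent on the constraint g(x, y) = x^2 + y^2 = 1:
# gradient step, then project back onto the unit circle.
x, y = 1.0, 0.0
lr = 0.1
for _ in range(200):
    gx, gy = grad_f(x, y)
    x, y = x + lr * gx, y + lr * gy   # ascent step (leaves the circle)
    norm = math.hypot(x, y)
    x, y = x / norm, y / norm          # project back onto g = 1

print(x, y)  # converges to (1/sqrt(2), 1/sqrt(2)) ~ (0.707, 0.707)
```

The projection here is cheap because the constraint set is a circle; for general g(x,y)=c the projection step is itself an optimization, which is why tangency-based conditions (next) are often used instead.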
Lagrange condition: ∇f = λ∇g at the optimum (the gradients are parallel, which is exactly the tangency of the level curves). The multiplier λ measures how much the optimal value of f would improve if the constraint were relaxed: λ = df*/dc. This condition is the foundation of the KKT conditions, shadow prices in economics, maximum-entropy derivations, and constrained formulations in neural network training.
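A small numeric check of both claims, on an assumed toy problem (maximize f(x,y) = x + y subject to x² + y² = c, which has the closed-form optimum x = y = √(c/2) and optimal value f* = √(2c)): solve ∇f = λ∇g for λ at the optimum, then confirm λ matches the sensitivity df*/dc via a finite difference.

```python
import math

# Maximize f(x, y) = x + y subject to g(x, y) = x^2 + y^2 = c.
# Closed form: optimum at x = y = sqrt(c/2), optimal value f* = sqrt(2c).

def f_star(c):
    return math.sqrt(2.0 * c)

c = 1.0
x = y = math.sqrt(c / 2.0)

# Lagrange condition grad f = lambda * grad g:
# (1, 1) = lambda * (2x, 2y)  =>  lambda = 1 / (2x)
lam = 1.0 / (2.0 * x)

# Shadow-price interpretation: lambda = d f*/dc,
# estimated here with a central finite difference.
h = 1e-6
sensitivity = (f_star(c + h) - f_star(c - h)) / (2.0 * h)

print(lam, sensitivity)  # both ~ 0.7071 = 1/sqrt(2)
```

The agreement between `lam` and `sensitivity` is the shadow-price statement in miniature: relaxing the constraint budget c by a small δ buys roughly λδ more objective value.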