Figure: left panel shows the value function V(x); right panel shows the optimal control u*(x).
Hamilton-Jacobi-Bellman equation: for a controlled SDE dx = f(x,u)dt + σdW with quadratic running cost Q·x² + R·u² and discount rate γ⁻¹, the optimal value function V(x) satisfies:
0 = min_u [Q·x² + R·u² + f(x,u)·V'(x) + ½σ²·V''(x) − γ⁻¹·V(x)].
Minimizing over u (for control-affine dynamics f(x,u) = a·x + b·u and quadratic control cost): setting ∂/∂u [R·u² + b·u·V'] = 0 gives u* = −b·V'/(2R) = −f_u·V'/(2R), a state-feedback law proportional to the value gradient.
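A quick numerical sanity check of the closed-form minimizer: scan the u-dependent part of the bracket, R·u² + (a·x + b·u)·V', over a fine grid and compare the grid minimizer with −b·V'/(2R). All parameter values (a, b, R, x, V') are made up for illustration.

```python
import numpy as np

# Illustrative parameters (not from the text): dynamics f = a*x + b*u,
# control cost R*u^2, evaluated at one state x with value gradient Vp = V'(x).
a, b, R = 0.7, 1.3, 2.0
x, Vp = 1.5, 4.0

# u-dependent part of the HJB bracket, minimized by brute-force grid scan
us = np.linspace(-5.0, 5.0, 100001)
bracket = R * us**2 + (a * x + b * us) * Vp
u_scan = us[np.argmin(bracket)]

# Closed-form minimizer u* = -b * V' / (2R)
u_star = -b * Vp / (2 * R)
```

The scan and the closed form agree to within the grid spacing; the a·x·V' term shifts the bracket vertically but does not move its minimizer.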
For LQR (f = a·x + u, i.e. b = 1): V(x) = P·x² + c. Matching the x² terms of the HJB gives the discounted scalar Riccati equation 2aP − P²/R + Q − γ⁻¹P = 0; the noise term ½σ²·V'' = σ²P is constant in x and only sets the offset c = γσ²P.
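The discounted scalar Riccati equation is a quadratic in P and can be solved in closed form by taking its positive root. A minimal sketch, with illustrative values for a, Q, R, γ:

```python
import numpy as np

# Illustrative parameters (not from the text)
a, Q, R, gamma = 1.0, 1.0, 1.0, 10.0

# 2aP - P^2/R + Q - P/gamma = 0  rearranges to
# P^2 - (2a - 1/gamma)*R*P - Q*R = 0; take the positive root.
coef = (2 * a - 1 / gamma) * R
P = (coef + np.sqrt(coef**2 + 4 * Q * R)) / 2

residual = 2 * a * P - P**2 / R + Q - P / gamma   # should be ~0
u_gain = -P / R          # feedback u*(x) = -(P/R) * x, from u* = -V'/(2R)
c = gamma * sigma_sq_P if False else None          # offset needs sigma; omitted here
```

The positive root is the stabilizing solution; the negative root also solves the quadratic but gives a value function unbounded below.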
Value iteration (time step dt, per-step discount e^{−dt/γ}): V_{n+1}(x) = min_u [Q·x²·dt + R·u²·dt + e^{−dt/γ}·E_ξ V_n(x + f(x,u)·dt + σ·√dt·ξ)], with ξ ~ N(0,1).
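The value-iteration update can be sketched on a state grid. This is a rough illustration with assumed parameters (a, Q, R, σ, γ, grid sizes all invented): the expectation over ξ is approximated by a symmetric two-point rule (ξ = ±1, which matches the mean and variance of N(0,1)), and V_n between grid points by linear interpolation.

```python
import numpy as np

# Illustrative scalar problem (parameters not from the text):
# dx = (a*x + u) dt + sigma dW, cost Q*x^2 + R*u^2, discount rate 1/gamma
a, Q, R, sigma, gamma = 1.0, 1.0, 1.0, 0.5, 10.0
dt = 0.05
disc = np.exp(-dt / gamma)            # per-step discount factor

xs = np.linspace(-3.0, 3.0, 121)      # state grid
us = np.linspace(-6.0, 6.0, 61)       # control grid
V = np.zeros_like(xs)

for _ in range(4000):
    # drifted next state for every (x, u) pair, shape (len(xs), len(us))
    x_next = xs[:, None] + (a * xs[:, None] + us[None, :]) * dt
    # two-point noise approximation: xi = +1 and xi = -1 with prob 1/2 each
    kick = sigma * np.sqrt(dt)
    EV = 0.5 * (np.interp(x_next + kick, xs, V)     # interp clamps at edges
                + np.interp(x_next - kick, xs, V))
    cost = (Q * xs[:, None] ** 2 + R * us[None, :] ** 2) * dt
    V = np.min(cost + disc * EV, axis=1)            # Bellman backup
```

With a symmetric cost and symmetric grids, the converged V is convex with its minimum at x = 0; the edge clamping in np.interp distorts V near the grid boundary, so any quadratic fit to recover P should use only the interior.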