Information Geometry — Fisher-Rao Metric & Natural Gradient
Statistical manifolds, geodesics in distribution space, and Amari's natural gradient learning
Distribution Family
Gaussian
Gamma
Bernoulli
Parameter θ₁ =
0.00
Parameter θ₂ =
1.00
Target θ₁* =
2.00
Target θ₂* =
0.50
Compare Gradient Paths
Metrics
KL(p||q)
—
Fisher distance
—
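For the Gaussian family, both readouts have closed forms. The sketch below is one way they might be computed. It assumes θ₁ is the mean μ and θ₂ is the standard deviation σ (the page does not state the parametrization), and it takes p as the current distribution and q as the target. The function names are illustrative only.

```python
import math

def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) for p = N(mu_p, sigma_p^2) and q = N(mu_q, sigma_q^2)."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p**2 + (mu_p - mu_q)**2) / (2.0 * sigma_q**2)
            - 0.5)

def fisher_rao_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """Fisher-Rao geodesic distance between two univariate Gaussians.

    In (mu, sigma) coordinates the metric is ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2,
    which is twice the Poincare half-plane metric after rescaling mu by 1/sqrt(2).
    """
    num = 0.5 * (mu_p - mu_q)**2 + (sigma_p - sigma_q)**2
    return math.sqrt(2.0) * math.acosh(1.0 + num / (2.0 * sigma_p * sigma_q))

# Default slider values from the page, read as (mu, sigma): an assumption.
kl = kl_gaussian(0.0, 1.0, 2.0, 0.5)
dist = fisher_rao_gaussian(0.0, 1.0, 2.0, 0.5)
print(f"KL(p||q) = {kl:.4f}, Fisher distance = {dist:.4f}")
```

Note that the KL divergence is asymmetric in p and q, while the Fisher distance is a true metric and is symmetric.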
Fisher metric: g_{ij}(θ) = E_{x~p(x|θ)}[∂_i log p(x|θ) · ∂_j log p(x|θ)]
Natural gradient: ∇̃ℓ(θ) = G(θ)^{-1} ∇ℓ(θ)
The metric and the natural-gradient direction are invariant under smooth reparametrization of θ
KL divergence: D_KL(p_θ||p_θ*) ≈ ½(θ-θ*)ᵀG(θ)(θ-θ*) for θ close to θ* (second-order approximation)
Geodesics = "straight lines" (locally shortest paths) on the statistical manifold
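The formulas above can be tried out numerically. The following is a minimal sketch for the Gaussian family, again assuming θ = (μ, σ). It uses KL(p_θ* || p_θ) as the loss ℓ(θ), which is an illustrative choice rather than something the page specifies. It compares plain and natural gradient descent, then checks the quadratic KL approximation near the target. All function names are hypothetical.

```python
import numpy as np

def fisher_matrix(theta):
    """Fisher information G(theta) of N(mu, sigma^2) in (mu, sigma) coordinates."""
    _, sigma = theta
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])

def kl_to_target(theta, target):
    """Loss: KL(N(target) || N(theta)); zero exactly when theta == target."""
    mu, sigma = theta
    mu_t, sigma_t = target
    return (np.log(sigma / sigma_t)
            + (sigma_t**2 + (mu_t - mu)**2) / (2.0 * sigma**2) - 0.5)

def grad_kl(theta, target):
    """Euclidean gradient of kl_to_target with respect to (mu, sigma)."""
    mu, sigma = theta
    mu_t, sigma_t = target
    d_mu = (mu - mu_t) / sigma**2
    d_sigma = 1.0 / sigma - (sigma_t**2 + (mu_t - mu)**2) / sigma**3
    return np.array([d_mu, d_sigma])

def descend(theta0, target, lr=0.1, steps=200, natural=False):
    """Gradient descent; with natural=True the step is G(theta)^{-1} grad."""
    theta = np.array(theta0, dtype=float)
    path = [theta.copy()]
    for _ in range(steps):
        g = grad_kl(theta, target)
        if natural:
            # Solve G x = g instead of forming the inverse explicitly.
            g = np.linalg.solve(fisher_matrix(theta), g)
        theta = theta - lr * g
        path.append(theta.copy())
    return np.array(path)

start, target = (0.0, 1.0), (2.0, 0.5)   # default slider values, read as (mu, sigma)
plain_path = descend(start, target, natural=False)
natural_path = descend(start, target, natural=True)
print("plain   final theta:", plain_path[-1])
print("natural final theta:", natural_path[-1])

# Quadratic approximation of the KL divergence for a small offset from the target.
delta = np.array([0.01, -0.01])
exact = kl_to_target(np.array(target) + delta, target)
quad = 0.5 * delta @ fisher_matrix(target) @ delta
print("exact KL:", exact, "quadratic approximation:", quad)
```

Here G is evaluated at θ* in the quadratic check; to second order in θ-θ*, evaluating it at θ gives the same approximation. The step size and step count are arbitrary choices for illustration.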