Information Geometry — Fisher-Rao Metric & Natural Gradient

Statistical manifolds, geodesics in distribution space, and Amari's natural gradient learning

Distribution Family

Gaussian, Gamma, Bernoulli

Metrics

KL(p||q)
Fisher distance
Fisher metric: g_{ij}(θ) = E_{x~p_θ}[∂_i log p_θ(x) · ∂_j log p_θ(x)], where ∂_i = ∂/∂θ_i
Natural gradient: ∇̃ℓ = G(θ)^{-1} ∇ℓ
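Invariant under reparametrization: the Fisher metric transforms as a tensor, so lengths and distances do not depend on the coordinates chosen for θ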
KL divergence (local approximation): D_KL(p_θ || p_θ*) ≈ ½(θ-θ*)ᵀ G(θ*) (θ-θ*) for θ near θ*
Geodesics = "straight lines" (locally shortest paths) on the statistical manifold, with length measured by the Fisher metric
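
The Fisher metric formula above can be checked numerically. The sketch below is an illustration only, assuming a univariate Gaussian in the (mu, sigma) parametrization; the function names are hypothetical and not taken from the page. It estimates the expectation of the score outer product by sampling from the model and compares it with the known closed form diag(1/sigma^2, 2/sigma^2).

```python
import numpy as np

def gaussian_score(x, mu, sigma):
    """Gradient of log N(x | mu, sigma^2) with respect to (mu, sigma)."""
    d_mu = (x - mu) / sigma**2
    d_sigma = ((x - mu) ** 2 - sigma**2) / sigma**3
    return np.stack([d_mu, d_sigma], axis=-1)

def fisher_monte_carlo(mu, sigma, n_samples=200_000, seed=0):
    """Estimate G(theta) = E[score score^T] with samples drawn from the model."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=n_samples)
    s = gaussian_score(x, mu, sigma)      # shape (n_samples, 2)
    return s.T @ s / n_samples            # average outer product, shape (2, 2)

def fisher_exact(mu, sigma):
    """Closed form in the (mu, sigma) chart; it does not depend on mu."""
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])
```

Calling fisher_monte_carlo(0.5, 2.0) should land close to fisher_exact(0.5, 2.0), with the gap shrinking as n_samples grows.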
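
The natural gradient rule above, a plain gradient premultiplied by G(θ)^{-1}, can be sketched as a small fitting loop. This is a minimal example under stated assumptions: the loss ℓ is taken to be the average negative log-likelihood of a univariate Gaussian in (mu, sigma), and the function names, learning rate, and step count are illustrative choices rather than anything specified by the page.

```python
import numpy as np

def nll_grad(data, mu, sigma):
    """Gradient of the average negative log-likelihood w.r.t. (mu, sigma)."""
    d_mu = -(data - mu).mean() / sigma**2
    d_sigma = 1.0 / sigma - ((data - mu) ** 2).mean() / sigma**3
    return np.array([d_mu, d_sigma])

def fisher_inverse(sigma):
    """Inverse of G = diag(1/sigma^2, 2/sigma^2) for the (mu, sigma) chart."""
    return np.diag([sigma**2, sigma**2 / 2.0])

def natural_gradient_fit(data, mu=0.0, sigma=1.0, lr=0.5, steps=50):
    """Iterate theta <- theta - lr * G(theta)^{-1} grad."""
    theta = np.array([mu, sigma], dtype=float)
    for _ in range(steps):
        grad = nll_grad(data, theta[0], theta[1])
        theta = theta - lr * fisher_inverse(theta[1]) @ grad
    return theta
```

With this preconditioning the mu update reduces to mu + lr * (sample mean - mu), so its step size no longer depends on the current sigma. The plain gradient, by contrast, shrinks like 1/sigma^2 when sigma is large.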
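
The quadratic approximation of the KL divergence listed above holds only locally. As a rough illustration, again assuming univariate Gaussians in (mu, sigma) and using hypothetical function names, the sketch below compares the exact closed-form KL divergence with the quadratic form built from the Fisher matrix at the second argument.

```python
import numpy as np

def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """Exact D_KL(N(mu_p, sigma_p^2) || N(mu_q, sigma_q^2))."""
    return (np.log(sigma_q / sigma_p)
            + (sigma_p**2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q**2)
            - 0.5)

def kl_quadratic(mu_p, sigma_p, mu_q, sigma_q):
    """Second-order approximation 0.5 * d^T G d, with G evaluated at q."""
    d = np.array([mu_p - mu_q, sigma_p - sigma_q])
    G = np.diag([1.0 / sigma_q**2, 2.0 / sigma_q**2])
    return 0.5 * d @ G @ d

# Nearby parameters: the two values should agree closely.
near_exact = kl_gaussian(0.01, 1.02, 0.0, 1.0)
near_quad = kl_quadratic(0.01, 1.02, 0.0, 1.0)

# Distant parameters: the approximation is expected to drift.
far_exact = kl_gaussian(1.0, 2.0, 0.0, 1.0)
far_quad = kl_quadratic(1.0, 2.0, 0.0, 1.0)
```

The near pair differs only at third order in the parameter offset, while the far pair shows a visible gap, which is why the formula is best read as a statement about small steps.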
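
To make the geodesic remark concrete, the sketch below uses the known closed-form Fisher-Rao distance between univariate Gaussians, whose metric ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2 is a rescaled hyperbolic half-plane. It compares that distance with the Fisher length of a segment that is straight in (mu, sigma) coordinates. The function names and grid size are assumptions made for this example.

```python
import numpy as np

def fisher_rao_gaussian(mu1, sigma1, mu2, sigma2):
    """Closed-form Fisher-Rao (geodesic) distance between univariate Gaussians."""
    num = 0.5 * (mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2
    return np.sqrt(2.0) * np.arccosh(1.0 + num / (2.0 * sigma1 * sigma2))

def straight_line_length(mu1, sigma1, mu2, sigma2, n=20_001):
    """Fisher length of the coordinate-straight segment in the (mu, sigma) chart."""
    t = np.linspace(0.0, 1.0, n)
    mu = mu1 + t * (mu2 - mu1)
    sigma = sigma1 + t * (sigma2 - sigma1)
    d_mu, d_sigma = np.diff(mu), np.diff(sigma)
    sigma_mid = 0.5 * (sigma[1:] + sigma[:-1])   # midpoint rule for 1/sigma
    return np.sum(np.sqrt(d_mu**2 + 2.0 * d_sigma**2) / sigma_mid)
```

For two Gaussians with equal sigma and different means, the coordinate-straight segment comes out longer than the geodesic distance, because the geodesic bends toward larger sigma where the metric is cheaper. For equal means, the vertical segment is itself a geodesic and the two values agree up to discretization error.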