Action as inference — policy selection via expected free energy minimization
Active inference (Friston 2009) reframes action selection as variational inference: an agent scores candidate policies π by their Expected Free Energy G(π) = risk + ambiguity, where risk = KL[q(o|π) ‖ p(o)] is the divergence of predicted outcomes from preferred outcomes (pragmatic, extrinsic value) and ambiguity = E_{q(s|π)}[H[p(o|s)]] is the expected entropy of the likelihood, which epistemic (information-seeking) behavior resolves. Policies are then sampled from a softmax over −G, q(π) = σ(−γ G(π)), with precision γ acting as the inverse temperature. Unlike pure reward maximization, this single objective unifies exploration (epistemic value) and exploitation (pragmatic value). The agent (blue dot) navigates toward goals (gold) while resolving uncertainty about the hidden state landscape.
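The policy loop above can be sketched on a toy discrete model. This is a minimal illustration, not the implementation behind the figure: the 4-state/4-observation likelihood, the preference prior, and the three named candidate policies are all invented for the example; only the G(π) = risk + ambiguity decomposition and the softmax with precision γ come from the text.

```python
import numpy as np

# Toy discrete generative model (sizes and values are illustrative).
# Likelihood p(o|s): A[o, s], each column a distribution over observations.
A = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
])

# Preference prior p(o) over outcomes: the agent prefers observation 3.
log_C = np.log(np.array([0.05, 0.05, 0.05, 0.85]))

def expected_free_energy(q_s):
    """G(pi) = risk + ambiguity for a predicted state distribution q(s|pi)."""
    q_o = A @ q_s                                        # predicted outcomes q(o|pi)
    risk = np.sum(q_o * (np.log(q_o + 1e-16) - log_C))   # KL[q(o|pi) || p(o)]
    # Ambiguity: expected entropy of the likelihood under q(s|pi).
    ambiguity = -np.sum(q_s * np.sum(A * np.log(A + 1e-16), axis=0))
    return risk + ambiguity

# Hypothetical policies, each predicting a different state distribution q(s|pi).
policies = {
    "stay":      np.array([0.85, 0.05, 0.05, 0.05]),
    "explore":   np.array([0.25, 0.25, 0.25, 0.25]),
    "goto_goal": np.array([0.05, 0.05, 0.05, 0.85]),
}

G = np.array([expected_free_energy(q) for q in policies.values()])

gamma = 4.0                        # precision: inverse softmax temperature
q_pi = np.exp(-gamma * G)
q_pi /= q_pi.sum()                 # q(pi) = softmax(-gamma * G)

for name, g, p in zip(policies, G, q_pi):
    print(f"{name:10s} G = {g:.3f}  q(pi) = {p:.3f}")
```

The goal-directed policy wins here because its predicted outcomes match the preference prior (low risk), while ambiguity is identical across policies in this symmetric likelihood; raising γ concentrates q(π) on the minimum-G policy, lowering it makes selection more exploratory.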