Compare polynomial models of different degrees via Bayesian evidence. The evidence naturally penalizes complexity (Occam's razor). Watch how posterior model probabilities shift as data arrives.
Bayesian model selection: P(M_k|D) ∝ P(D|M_k)·P(M_k), where the evidence (marginal likelihood) P(D|M_k) = ∫P(D|θ,M_k)P(θ|M_k)dθ integrates out the parameters. For linear-Gaussian models the evidence is exactly analytic; otherwise the Laplace approximation (a Gaussian expansion around the posterior mode) gives an estimate. BIC approximation: log P(D|M) ≈ log P(D|θ̂) - (k/2)·log N, where k is the number of free parameters and N the number of data points. Occam's razor emerges automatically: the evidence prefers simpler models unless the data demand the extra complexity. AIC, BIC, and DIC differ in how they penalize complexity. Used for feature selection, hypothesis testing, and neural architecture search.
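A minimal sketch of the comparison described above, using the BIC approximation to the log evidence. The synthetic quadratic dataset, the noise level, and the degree range are illustrative assumptions, not prescribed by the exercise; the per-model parameter count k here includes the noise variance alongside the polynomial coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: quadratic trend plus Gaussian noise.
N = 50
x = np.linspace(-1.0, 1.0, N)
y = 0.5 - 1.0 * x + 2.0 * x**2 + rng.normal(0.0, 0.3, size=N)

def bic_log_evidence(x, y, degree):
    """BIC approximation: log P(D|M) ~ log P(D|theta_hat) - (k/2) log N."""
    n = len(x)
    k = degree + 2                       # polynomial coefficients + noise variance
    coeffs = np.polyfit(x, y, degree)    # maximum-likelihood fit
    resid = y - np.polyval(coeffs, x)
    sigma2 = np.mean(resid**2)           # MLE of the noise variance
    log_lik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    return log_lik - 0.5 * k * np.log(n)

degrees = range(6)
log_ev = np.array([bic_log_evidence(x, y, d) for d in degrees])

# Posterior model probabilities under a uniform prior P(M_k),
# computed stably by subtracting the max log evidence before exponentiating.
w = np.exp(log_ev - log_ev.max())
post = w / w.sum()
for d, p in zip(degrees, post):
    print(f"degree {d}: P(M|D) = {p:.3f}")
```

With enough data the posterior mass concentrates on the true degree; rerunning with smaller N shows the probabilities spreading across models, which is the shift described in the exercise.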