Tishby-Pereira-Bialek (1999) — Compression vs Relevance Trade-off
IB Parameters
IB Equations
min I(X;T) − β·I(T;Y)
p(t|x) ∝ p(t)·exp(−β·D_KL[p(y|x)‖p(y|t)])
p(t) = Σₓ p(x)p(t|x)
p(y|t) = Σₓ p(y|x)p(x|t)
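The self-consistent equations above are typically solved by alternating updates until they stop changing. The following is a minimal NumPy sketch of that iteration, not the original authors' code. It assumes a discrete joint distribution p(x,y) with p(x) > 0 for every x, and it weights the update for p(t|x) by the cluster prior p(t). The names `iterative_ib`, `mutual_information`, `n_clusters`, `n_iter`, `seed` and `eps` are illustrative choices, not part of any library.

```python
import numpy as np


def mutual_information(p_ab):
    """I(A;B) in nats for a joint distribution given as a 2-D array."""
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of A, shape (A, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of B, shape (1, B)
    mask = p_ab > 0                         # skip zero cells (0 * log 0 = 0)
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a @ p_b)[mask])))


def iterative_ib(p_xy, n_clusters, beta, n_iter=200, seed=0, eps=1e-12):
    """Alternate the IB updates; return p(t|x), I(X;T) and I(T;Y)."""
    rng = np.random.default_rng(seed)
    p_x = p_xy.sum(axis=1)                  # p(x), shape (X,)
    p_y_given_x = p_xy / p_x[:, None]       # p(y|x), shape (X, Y)

    # Random soft assignment p(t|x); each row sums to 1.
    p_t_given_x = rng.random((len(p_x), n_clusters))
    p_t_given_x /= p_t_given_x.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # p(t) = sum_x p(x) p(t|x)
        p_t = p_x @ p_t_given_x
        # p(x|t) by Bayes' rule, then p(y|t) = sum_x p(y|x) p(x|t)
        p_x_given_t = (p_t_given_x * p_x[:, None]).T / (p_t[:, None] + eps)
        p_y_given_t = p_x_given_t @ p_y_given_x
        # D_KL[p(y|x) || p(y|t)] for every (x, t) pair, shape (X, T)
        log_ratio = (np.log(p_y_given_x[:, None, :] + eps)
                     - np.log(p_y_given_t[None, :, :] + eps))
        kl = np.sum(p_y_given_x[:, None, :] * log_ratio, axis=2)
        # p(t|x) proportional to p(t) * exp(-beta * KL), normalised in log space
        logits = np.log(p_t + eps)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)
        p_t_given_x = np.exp(logits)
        p_t_given_x /= p_t_given_x.sum(axis=1, keepdims=True)

    p_xt = p_t_given_x * p_x[:, None]       # joint p(x, t)
    p_ty = p_xt.T @ p_y_given_x             # joint p(t, y) under T - X - Y
    return p_t_given_x, mutual_information(p_xt), mutual_information(p_ty)
```

Calling `iterative_ib(p_xy, n_clusters=2, beta=5.0)` on a small joint table returns the soft assignment together with the two mutual information values, in nats. Because the iteration can settle in a local optimum, rerunning with several `seed` values and keeping the run with the lowest I(X;T) − β·I(T;Y) is a reasonable precaution.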
Mutual information values and efficiency at current β.
Theory
The IB method finds the optimal trade-off between compression, measured by I(X;T), and relevance, measured by I(T;Y). The IB curve is the upper bound on all achievable (compression, relevance) pairs. β controls the trade-off: as β→0 the representation is maximally compressed and T carries no information about X, while as β→∞ all relevant information is preserved. Plotted as I(T;Y) against I(X;T), the curve is concave and its slope equals 1/β at each point.
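The relation between β and the slope of the curve follows from the Lagrangian itself. The short derivation below is a sketch that assumes relevance I(T;Y) is plotted on the vertical axis against compression I(X;T) on the horizontal axis, and that the optimum varies smoothly with β.

```latex
% Assumption: each point on the curve is a stationary point of L_beta.
\begin{aligned}
\mathcal{L}_\beta &= I(X;T) - \beta\, I(T;Y), \\
\delta \mathcal{L}_\beta = 0 \;&\Longrightarrow\; \delta I(X;T) = \beta\, \delta I(T;Y), \\
\frac{dI(T;Y)}{dI(X;T)} &= \frac{1}{\beta}.
\end{aligned}
```

Because the curve is concave, its slope falls as I(X;T) grows, so larger β selects points further along the curve. The data processing inequality gives I(T;Y) ≤ I(X;T), so the slope never exceeds 1, and for β ≤ 1 the minimum is the trivial solution with I(X;T) = I(T;Y) = 0.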