Information Bottleneck
min over p(t|x) of I(T;X) subject to I(T;Y) ≥ I_min (a chosen relevance threshold): compression that preserves relevance
Figure panels: joint distribution p(X,Y); IB tradeoff curve, I(T;X) vs I(T;Y).
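The Information Bottleneck (Tishby, Pereira and Bialek, 1999): find a representation T of X, defined by a (generally stochastic) encoder p(t|x) so that T depends on Y only through X, which compresses X as much as possible while retaining information about Y.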
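To make the two quantities concrete, here is a minimal sketch that computes I(T;X) and I(T;Y) for a fixed encoder on a small discrete example. The joint `p_xy`, the encoder `p_t_given_x` and the helper `mutual_information` are illustrative assumptions, not taken from the original; the sketch also assumes T depends on Y only through X.

```python
import numpy as np

def mutual_information(p_ab):
    """I(A;B) in bits for a joint distribution given as a 2-D array."""
    p_ab = np.asarray(p_ab, dtype=float)
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of A, shape (a, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of B, shape (1, b)
    mask = p_ab > 0                          # skip zero cells (0 * log 0 = 0)
    ratio = p_ab[mask] / (p_a @ p_b)[mask]
    return float(np.sum(p_ab[mask] * np.log2(ratio)))

# Hypothetical joint p(x, y): 4 values of X (rows), 2 values of Y (columns).
p_xy = np.array([[0.20, 0.05],
                 [0.20, 0.05],
                 [0.05, 0.20],
                 [0.05, 0.20]])

# Hypothetical encoder p(t|x): rows index x, columns index t.
# It merges x in {0, 1} into t = 0 and x in {2, 3} into t = 1.
p_t_given_x = np.array([[1.0, 0.0],
                        [1.0, 0.0],
                        [0.0, 1.0],
                        [0.0, 1.0]])

p_x = p_xy.sum(axis=1)
p_xt = p_x[:, None] * p_t_given_x   # joint p(x, t) = p(x) p(t|x)
p_ty = p_t_given_x.T @ p_xy         # joint p(t, y) = sum_x p(t|x) p(x, y)

i_xy = mutual_information(p_xy)     # relevance available in X
i_tx = mutual_information(p_xt)     # compression cost of T
i_ty = mutual_information(p_ty)     # relevance retained by T

print(f"I(X;Y) = {i_xy:.3f} bits, I(T;X) = {i_tx:.3f} bits, I(T;Y) = {i_ty:.3f} bits")
```

On this toy joint, I(T;X) is 1 bit (against H(X) = 2 bits) while I(T;Y) equals I(X;Y), about 0.278 bits, so this particular T halves the description of X without losing anything about Y.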
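Lagrangian: minimize I(T;X) − β·I(T;Y) over encoders p(t|x), where β ≥ 0 sets the tradeoff. A Blahut-Arimoto-style alternating iteration over p(t|x), p(t) and p(y|t) converges to a stationary point for each β, which is not guaranteed to be the global optimum; sweeping β traces out the IB curve.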
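The alternating iteration can be sketched as follows for discrete X and Y. This is a minimal illustration under stated assumptions, not a reference implementation: the names `iterative_ib`, `kl_rows` and `info_plane`, the toy joint `p_xy`, the fixed number of clusters `n_t`, the random initialization and the fixed iteration count are all choices made here, and the sketch assumes p(x) > 0 for every x. It uses the self-consistent update p(t|x) ∝ p(t)·exp(−β·KL(p(y|x) ‖ p(y|t))).

```python
import numpy as np

def kl_rows(p, q, eps=1e-12):
    """KL(p_i || q_j) in nats for every pair of rows; shape (len(p), len(q))."""
    p = p[:, None, :]
    q = q[None, :, :]
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=2)

def iterative_ib(p_xy, n_t, beta, n_iter=500, seed=0):
    """Alternate the IB self-consistent updates; returns the encoder p(t|x)."""
    rng = np.random.default_rng(seed)
    p_x = p_xy.sum(axis=1)
    p_y_given_x = p_xy / p_x[:, None]
    q = rng.random((len(p_x), n_t))            # random soft initial encoder
    q /= q.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        p_t = p_x @ q                          # p(t) = sum_x p(x) p(t|x)
        p_ty = q.T @ p_xy                      # p(t, y) = sum_x p(t|x) p(x, y)
        p_y_given_t = p_ty / np.maximum(p_t[:, None], 1e-12)
        d = kl_rows(p_y_given_x, p_y_given_t)  # distortion d(x, t)
        log_q = np.log(p_t + 1e-12)[None, :] - beta * d
        log_q -= log_q.max(axis=1, keepdims=True)   # numerical stability
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)      # normalize over t for each x
    return q

def mutual_information(p_ab):
    """I(A;B) in bits for a joint distribution given as a 2-D array."""
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return float(np.sum(p_ab[mask] * np.log2(p_ab[mask] / (p_a @ p_b)[mask])))

def info_plane(p_xy, q):
    """Return (I(T;X), I(T;Y)) in bits for encoder q = p(t|x)."""
    p_x = p_xy.sum(axis=1)
    return mutual_information(p_x[:, None] * q), mutual_information(q.T @ p_xy)

# Hypothetical toy joint p(x, y): 4 values of X, 2 values of Y.
p_xy = np.array([[0.20, 0.05],
                 [0.20, 0.05],
                 [0.05, 0.20],
                 [0.05, 0.20]])

# One point in the information plane per beta.
curve = [(beta, *info_plane(p_xy, iterative_ib(p_xy, n_t=2, beta=beta)))
         for beta in (0.0, 1.0, 5.0, 50.0)]
for beta, i_tx, i_ty in curve:
    print(f"beta={beta:5.1f}  I(T;X)={i_tx:.3f}  I(T;Y)={i_ty:.3f}")
```

Because each run only reaches a stationary point, a more careful sweep would typically try several seeds per β, or warm-start each β from the previous solution, and keep the encoder with the lowest objective.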
β→0: maximum compression (T is constant, so I(T;X) = 0). β→∞: all relevant information is kept (I(T;Y) → I(X;Y)); T = X is one such solution.
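The small-β limit can be made slightly more precise with a short, standard argument that is not spelled out above. It assumes only that T depends on Y through X, so the data processing inequality gives I(T;Y) ≤ I(T;X); the symbol \mathcal{L}_\beta for the Lagrangian is notation introduced here.

```latex
\begin{align*}
\mathcal{L}_\beta\bigl[p(t \mid x)\bigr]
  &= I(T;X) - \beta\, I(T;Y) \\
  &\ge I(T;X) - \beta\, I(T;X)
     && \text{since } I(T;Y) \le I(T;X) \text{ and } \beta \ge 0 \\
  &= (1 - \beta)\, I(T;X) \;\ge\; 0
     && \text{for } 0 \le \beta \le 1 .
\end{align*}
```

A constant T attains the value 0, so it is already optimal for every β ≤ 1, and nontrivial representations can only appear for β > 1. At the other extreme, I(T;Y) ≤ I(X;Y) with equality at T = X, so for very large β the −β·I(T;Y) term dominates and pushes I(T;Y) toward I(X;Y).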