Information Bottleneck

Compression vs. relevance · rate-distortion theory
Source Distribution
Classes (Y): 3
Cluster separation: 50
Noise σ: 20
Bottleneck
β (tradeoff): 0.50
Code words |T|: 4
The Information Bottleneck (Tishby et al. 1999) formalizes the tradeoff between compressing X into a representation T and preserving information about Y. The IB curve traces the achievable (I(X;T), I(T;Y)) pairs along the upper-left boundary of the feasible region: for a given compression level I(X;T), it gives the largest attainable I(T;Y). The parameter β controls the tradeoff: small β favors compression (T retains fewer bits about X), while large β allows a more complex T that captures more of Y.
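The tradeoff can be made concrete with a small numerical sketch. The Python below is one possible implementation of the standard self-consistent IB iteration, assuming the Lagrangian form "minimize I(X;T) − β·I(T;Y)" over a soft encoder p(t|x). It is not the code behind this page. The toy joint distribution, the function names `mutual_information` and `information_bottleneck`, and the iteration count are all illustrative assumptions. Under this convention any β below 1 collapses T to a trivial code, because I(T;Y) can never exceed I(X;T). The page's slider may therefore use a different scaling for β.

```python
import numpy as np


def mutual_information(p_ab):
    """Mutual information (in bits) of a joint distribution given as a 2-D array."""
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal over rows, shape (na, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal over columns, shape (1, nb)
    mask = p_ab > 0                         # skip zero cells (0 * log 0 = 0)
    return float(np.sum(p_ab[mask] * np.log2(p_ab[mask] / (p_a @ p_b)[mask])))


def information_bottleneck(p_xy, n_codes, beta, n_iter=200, seed=0):
    """Iterate the IB self-consistent equations; return (I(X;T), I(T;Y)) in bits.

    Assumes every x has nonzero probability. n_iter and seed are illustrative.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12
    p_x = p_xy.sum(axis=1)                          # p(x), shape (nx,)
    p_y_given_x = p_xy / p_x[:, None]               # p(y|x), shape (nx, ny)

    # Random soft assignment p(t|x) as the starting point.
    p_t_given_x = rng.random((len(p_x), n_codes))
    p_t_given_x /= p_t_given_x.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        p_t = p_x @ p_t_given_x                     # p(t), shape (nt,)
        p_ty = p_t_given_x.T @ p_xy                 # joint p(t, y), shape (nt, ny)
        p_y_given_t = p_ty / (p_t[:, None] + eps)   # p(y|t)

        # KL[p(y|x) || p(y|t)] in nats for every (x, t) pair, shape (nx, nt).
        log_ratio = (np.log(p_y_given_x[:, None, :] + eps)
                     - np.log(p_y_given_t[None, :, :] + eps))
        kl = np.sum(p_y_given_x[:, None, :] * log_ratio, axis=2)

        # Encoder update: p(t|x) proportional to p(t) * exp(-beta * KL).
        logits = np.log(p_t + eps)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p_t_given_x = np.exp(logits)
        p_t_given_x /= p_t_given_x.sum(axis=1, keepdims=True)

    p_xt = p_x[:, None] * p_t_given_x               # joint p(x, t)
    p_ty = p_t_given_x.T @ p_xy                     # joint p(t, y)
    return mutual_information(p_xt), mutual_information(p_ty)


# Toy source (an assumption, not the page's data): 6 inputs, 3 classes,
# each input pointing at its class with probability 0.8.
p_y_given_x = np.array([
    [0.8, 0.1, 0.1], [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1], [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8], [0.1, 0.1, 0.8],
])
p_xy = p_y_given_x / 6.0                            # uniform p(x) = 1/6

ixt_lo, ity_lo = information_bottleneck(p_xy, n_codes=4, beta=0.5)
ixt_hi, ity_hi = information_bottleneck(p_xy, n_codes=4, beta=20.0)
print(f"beta=0.5 : I(X;T)={ixt_lo:.3f} bits, I(T;Y)={ity_lo:.3f} bits")
print(f"beta=20  : I(X;T)={ixt_hi:.3f} bits, I(T;Y)={ity_hi:.3f} bits")
```

Sweeping β over a range and plotting the resulting (I(X;T), I(T;Y)) pairs gives an approximation of the IB curve. Because the iteration can settle in local optima, several random seeds per β are usually worth trying.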