Information Bottleneck

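The Information Bottleneck (Tishby, Pereira, Bialek 1999): given a source X and a relevance variable Y, find a compressed representation T of X that keeps I(T;Y) as large as possible while keeping I(X;T) as small as possible. T depends on Y only through X (Markov chain T - X - Y). For each tradeoff parameter β > 0, one minimizes I(X;T) − β·I(T;Y) over encoders p(t|x); sweeping β traces the optimal tradeoff curve (the IB bound) in the plane of I(X;T) versus I(T;Y). It has been hypothesized that deep networks move toward this curve during training, though that claim remains debated.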
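For discrete X and Y, the minimization above can be approached with the iterative scheme from the original IB paper, which alternates the self-consistent update p(t|x) ∝ p(t)·exp(−β·KL[p(y|x) ‖ p(y|t)]) with recomputing p(t) and p(y|t). The following is a minimal sketch, not the code behind this demo: the joint distribution `p_xy`, the function names `mutual_information` and `iterative_ib`, and the iteration count are all illustrative assumptions. Information is measured in nats.

```python
import numpy as np

def mutual_information(p_ab):
    """I(A;B) in nats for a joint distribution given as a 2-D array."""
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a @ p_b)[mask])))

def iterative_ib(p_xy, n_t, beta, n_iter=500, seed=0):
    """Alternate the IB self-consistent updates for a fixed beta.

    Returns the encoder p(t|x) with shape (n_x, n_t), plus I(X;T) and I(T;Y).
    The result is a local optimum and can depend on the random initialization.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12  # guards against log(0) and division by zero
    p_x = p_xy.sum(axis=1)
    p_y_given_x = p_xy / p_x[:, None]

    # Random soft assignment as the starting encoder.
    enc = rng.random((len(p_x), n_t))
    enc /= enc.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        p_t = p_x @ enc                          # p(t) = sum_x p(x) p(t|x)
        p_ty = enc.T @ p_xy                      # p(t,y) = sum_x p(t|x) p(x,y)
        p_y_given_t = p_ty / (p_t[:, None] + eps)
        # KL[p(y|x) || p(y|t)] for every pair (x, t), shape (n_x, n_t).
        log_ratio = (np.log(p_y_given_x[:, None, :] + eps)
                     - np.log(p_y_given_t[None, :, :] + eps))
        kl = np.sum(p_y_given_x[:, None, :] * log_ratio, axis=2)
        # p(t|x) proportional to p(t) * exp(-beta * KL), normalized over t.
        logits = np.log(p_t + eps)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)
        enc = np.exp(logits)
        enc /= enc.sum(axis=1, keepdims=True)

    p_xt = p_x[:, None] * enc
    p_ty = enc.T @ p_xy
    return enc, mutual_information(p_xt), mutual_information(p_ty)

# Hypothetical joint distribution: 4 source values, 2 relevance values.
p_xy = np.array([[0.20, 0.05],
                 [0.18, 0.07],
                 [0.06, 0.19],
                 [0.05, 0.20]])

encoder, i_xt, i_ty = iterative_ib(p_xy, n_t=2, beta=5.0)
print(f"I(X;T) = {i_xt:.4f} nats, I(T;Y) = {i_ty:.4f} nats")
```

Running this for a range of β values and plotting the resulting (I(X;T), I(T;Y)) pairs gives an approximation of the tradeoff curve. Because the updates only reach a local optimum, restarting from several seeds is a reasonable precaution.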

Parameters: β (tradeoff) = 2.0; source clusters = 4; noise σ = 0.5

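Blue curve = IB optimal tradeoff. Orange dot = the point on the curve for the current β. The curve has bifurcation points: critical values of β at which the complexity of the representation (the number of effectively distinct values of T) jumps. As β→0, T is trivial (a constant carrying no information); with the objective above this already holds for every β ≤ 1, because I(T;Y) ≤ I(X;T) by the data processing inequality. As β→∞, T retains all relevant information, so I(T;Y) → I(X;Y). T = X achieves this, though a minimal sufficient statistic of X for Y does so with smaller I(X;T).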
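The two endpoints of the curve can be checked directly, without any optimization, by evaluating I(X;T) and I(T;Y) for a constant encoder and for the identity encoder T = X. The sketch below does this on a hypothetical joint distribution `p_xy` chosen so that pairs of source values share the same p(y|x); the helper names `mutual_information` and `ib_plane_point` are illustrative, not part of the demo. It also evaluates a merged encoder that groups the source values with identical p(y|x), to show a representation that keeps all relevant information at lower I(X;T) than T = X.

```python
import numpy as np

def mutual_information(p_ab):
    """I(A;B) in nats for a joint distribution given as a 2-D array."""
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a @ p_b)[mask])))

# Hypothetical joint p(x, y): rows 0-1 share one p(y|x), rows 2-3 another.
p_xy = np.array([[0.20, 0.05],
                 [0.20, 0.05],
                 [0.05, 0.20],
                 [0.05, 0.20]])
p_x = p_xy.sum(axis=1)

def ib_plane_point(encoder):
    """Return (I(X;T), I(T;Y)) for an encoder p(t|x) of shape (n_x, n_t)."""
    p_xt = p_x[:, None] * encoder   # joint p(x, t)
    p_ty = encoder.T @ p_xy         # joint p(t, y) under the chain T - X - Y
    return mutual_information(p_xt), mutual_information(p_ty)

constant_encoder = np.ones((4, 1))           # every x maps to one value of T
identity_encoder = np.eye(4)                 # T = X
merged_encoder = np.array([[1.0, 0.0],       # x0, x1 -> t0
                           [1.0, 0.0],
                           [0.0, 1.0],       # x2, x3 -> t1
                           [0.0, 1.0]])

trivial_point = ib_plane_point(constant_encoder)
full_point = ib_plane_point(identity_encoder)
merged_point = ib_plane_point(merged_encoder)

h_x = float(-np.sum(p_x * np.log(p_x)))      # entropy H(X) in nats
i_xy = mutual_information(p_xy)              # upper limit for I(T;Y)

print("constant T:", trivial_point)
print("T = X:     ", full_point, "with H(X) =", h_x, "and I(X;Y) =", i_xy)
print("merged T:  ", merged_point)
```

The constant encoder sits at the origin of the information plane, T = X sits at (H(X), I(X;Y)), and the merged encoder reaches the same I(T;Y) as T = X while using less I(X;T), which is the kind of solution the IB objective favors at large β.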