Shannon Channel Capacity — Noisy Channel Coding Theorem

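Shannon's noisy channel coding theorem (1948): reliable communication with arbitrarily small error probability is possible at every rate below the channel capacity C, and at no rate above it. The capacity C is the maximum of the mutual information I(X;Y) over all input distributions p(x), measured in bits per channel use when logarithms are base 2. Binary symmetric channel (BSC) with crossover probability p: C = 1 − H(p), where H(p) = −p log₂ p − (1−p) log₂(1−p) is the binary entropy function. Binary erasure channel (BEC) with erasure probability p: C = 1 − p. Real-valued AWGN channel: C = ½ log₂(1+SNR) bits per channel use. The channel transition matrix (right) shows the stochastic map from inputs to outputs.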
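
As a rough illustration of the formulas above, the following Python sketch evaluates the three closed-form capacities and computes I(X;Y) directly from a transition matrix. The function names (binary_entropy, bsc_capacity, bec_capacity, awgn_capacity, mutual_information) and the row-per-input matrix layout are assumptions made for this example and do not come from the original page. It is not a general capacity solver: it only checks that a uniform input reproduces the closed-form values for the BSC and BEC.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, using the convention 0 * log2(0) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p: float) -> float:
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)


def bec_capacity(p: float) -> float:
    """Capacity of a binary erasure channel with erasure probability p."""
    return 1.0 - p


def awgn_capacity(snr: float) -> float:
    """Capacity of a real AWGN channel in bits per channel use (linear SNR, not dB)."""
    return 0.5 * math.log2(1.0 + snr)


def mutual_information(px, w):
    """I(X;Y) in bits for input distribution px and transition matrix w.

    Assumed layout: w[x][y] = P(Y = y | X = x), so each row sums to 1.
    """
    n_out = len(w[0])
    # Output distribution: p(y) = sum over x of p(x) * P(y | x)
    py = [sum(px[x] * w[x][y] for x in range(len(px))) for y in range(n_out)]
    total = 0.0
    for x, p_x in enumerate(px):
        for y in range(n_out):
            joint = p_x * w[x][y]
            if joint > 0.0:  # zero-probability terms contribute nothing
                total += joint * math.log2(w[x][y] / py[y])
    return total


if __name__ == "__main__":
    p = 0.1
    bsc = [[1 - p, p], [p, 1 - p]]           # outputs: 0, 1
    bec = [[1 - p, p, 0.0], [0.0, p, 1 - p]]  # outputs: 0, erasure, 1
    uniform = [0.5, 0.5]
    print("BSC:", bsc_capacity(p), mutual_information(uniform, bsc))
    print("BEC:", bec_capacity(p), mutual_information(uniform, bec))
    print("AWGN at SNR = 3:", awgn_capacity(3.0))
```

For both the BSC and the BEC the uniform input distribution achieves the maximum, so the mutual information computed from the matrix should match the closed form up to floating-point rounding. For a channel without this symmetry, the maximization over p(x) would need a numerical method such as the Blahut-Arimoto algorithm, which this sketch does not implement.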