Shannon-Hartley Theorem: The channel capacity C = B·log₂(1 + S/N) gives the maximum
achievable data rate over a band-limited channel with additive white Gaussian noise (AWGN).
B is the bandwidth in Hz, and S/N is the signal-to-noise ratio as a linear power ratio (not dB).
This is a fundamental limit: any rate below C is achievable with arbitrarily low error probability,
but no coding scheme can reliably exceed C bits per second (Shannon 1948).
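As a quick numerical sketch of the formula (the function name and the 3.1 kHz / 30 dB example are illustrative choices, not from the text):

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Channel capacity C = B * log2(1 + S/N) in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Example: a 3.1 kHz voice-grade channel at 30 dB SNR.
snr = 10 ** (30 / 10)              # convert dB to a linear power ratio
c = shannon_capacity(3100, snr)
print(f"C = {c:.0f} bit/s")        # ~31 kbit/s, the classic modem-era bound
```

Note the dB conversion: the formula takes S/N as a linear ratio, so 30 dB must first become 1000.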
Mutual information I(X;Y) = H(Y) − H(Y|X) measures how much information the output Y tells
us about the input X. Capacity is the maximum mutual information over all input distributions.
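The definition I(X;Y) = H(Y) − H(Y|X) can be checked concretely on a binary symmetric channel, where the maximizing input is uniform and capacity works out to 1 − h₂(p) (the channel and helper names below are my illustrative choices):

```python
import math

def h2(p: float) -> float:
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(p_input: float, crossover: float) -> float:
    """I(X;Y) = H(Y) - H(Y|X) for a binary symmetric channel.

    p_input:   P(X = 1); crossover: bit-flip probability of the channel.
    H(Y|X) = h2(crossover) because each input sees the same flip probability.
    """
    p_y1 = p_input * (1 - crossover) + (1 - p_input) * crossover
    return h2(p_y1) - h2(crossover)

# Capacity is the maximum over input distributions; for a BSC the
# uniform input (p_input = 0.5) achieves it: C = 1 - h2(crossover).
print(bsc_mutual_information(0.5, 0.1))
```

Sweeping `p_input` away from 0.5 only lowers I(X;Y), which is the "maximum over all input distributions" in action.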
The key insight: at a fixed SNR, capacity grows linearly with bandwidth, whereas doubling the
signal power adds only about one bit per second per hertz at high SNR. Hence bandwidth is more
valuable than power in the high-SNR regime. (The linear gain assumes the SNR is held constant;
if total signal power is fixed, widening the band also admits more noise.)
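A toy comparison makes the tradeoff concrete (the 1 MHz / 20 dB baseline is an arbitrary illustrative choice; SNR is held constant when bandwidth doubles):

```python
import math

def capacity(b_hz: float, snr: float) -> float:
    """C = B * log2(1 + S/N), bits per second."""
    return b_hz * math.log2(1 + snr)

base     = capacity(1e6, 100)   # 1 MHz at 20 dB SNR
more_bw  = capacity(2e6, 100)   # doubled bandwidth, same SNR
more_pwr = capacity(1e6, 200)   # doubled signal power, same bandwidth

print(more_bw / base)           # exactly 2.0: linear in bandwidth
print(more_pwr - base)          # ~1e6 * log2(201/101): roughly +1 bit/s/Hz
```

Doubling bandwidth doubles the roughly 6.66 Mbit/s baseline, while doubling power adds under 1 Mbit/s, matching the linear-vs-logarithmic claim above.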