Probability Distribution Editor
Entropy vs Probability (Binary Channel)
H(p) = −p·log₂(p) − (1−p)·log₂(1−p) peaks at p=0.5 with H=1 bit
Mutual Information & Channel Capacity
I(X;Y) = H(X) − H(X|Y) — information shared between input and output
Distribution Histogram
Entropy vs Distribution Shape
Shannon's Information Theory (1948)
Claude Shannon defined entropy as H(X) = −Σ pᵢ log₂ pᵢ bits — the average surprise (information) per symbol. A fair coin flip carries 1 bit; a fair six-sided die carries log₂(6) ≈ 2.585 bits.
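The entropy formula above can be sketched directly in plain Python (no external libraries assumed); the `entropy` helper below is illustrative, not from the original:

```python
import math

def entropy(p):
    """Shannon entropy H(X) = -sum(p_i * log2(p_i)), in bits.
    Terms with p_i == 0 contribute nothing (the limit p*log p -> 0)."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

print(entropy([0.5, 0.5]))   # fair coin -> 1.0 bit
print(entropy([1/6] * 6))    # fair die -> log2(6), about 2.585 bits
```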
Maximum entropy is achieved by the uniform distribution: H_max = log₂(n) for n symbols. Any departure from uniformity reduces entropy.
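A quick numerical check of the maximum-entropy claim, comparing a uniform distribution against a skewed one of the same size (the specific skewed pmf is an arbitrary example):

```python
import math

def entropy(p):
    """Shannon entropy in bits; zero-probability terms are skipped."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

n = 4
uniform = [1 / n] * n
skewed = [0.7, 0.1, 0.1, 0.1]   # any departure from uniformity

print(entropy(uniform))   # log2(4) = 2.0 bits, the maximum for n = 4
print(entropy(skewed))    # strictly below 2.0 bits
```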
Channel capacity C = max_{p(x)} I(X;Y) — the maximum rate at which information can be reliably transmitted. Shannon's channel coding theorem: transmission at any rate R < C is possible with arbitrarily small error probability, while reliable transmission at R > C is impossible.
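For the binary symmetric channel the maximization has a closed form, C = 1 − H(ε), achieved by a uniform input; a minimal sketch (the function names are illustrative):

```python
import math

def h2(p):
    """Binary entropy function H(p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(eps):
    """Capacity of a binary symmetric channel with crossover
    probability eps: C = 1 - H(eps)."""
    return 1.0 - h2(eps)

print(bsc_capacity(0.0))   # noiseless channel -> 1 bit per use
print(bsc_capacity(0.5))   # output independent of input -> 0 bits
```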
Mutual information I(X;Y) = H(X) + H(Y) − H(X,Y) = H(X) − H(X|Y) — the reduction in uncertainty about X given knowledge of Y. It is zero for independent variables and equals H(X) for a noiseless (perfect) channel.
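The identity I(X;Y) = H(X) + H(Y) − H(X,Y) can be checked on the two extreme cases mentioned above, starting from a joint pmf given as a 2-D list (a sketch; the helper names are illustrative):

```python
import math

def entropy(p):
    """Shannon entropy in bits; zero-probability terms are skipped."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) from a joint pmf (rows = X, cols = Y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    pxy = [p for row in joint for p in row]
    return entropy(px) + entropy(py) - entropy(pxy)

indep = [[0.25, 0.25], [0.25, 0.25]]   # X, Y independent
perfect = [[0.5, 0.0], [0.0, 0.5]]     # Y = X, a noiseless channel

print(mutual_information(indep))    # -> 0.0
print(mutual_information(perfect))  # -> 1.0, which equals H(X)
```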
The binary entropy function H(p) = −p log₂p − (1−p)log₂(1−p) is concave, symmetric about p=0.5, with maximum 1 bit at p=0.5 and minimum 0 at p=0 or 1.
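The stated properties of H(p) — symmetry about p = 0.5, a 1-bit maximum there, and zero at the endpoints — can be verified numerically (plain Python, no dependencies assumed):

```python
import math

def h2(p):
    """Binary entropy function H(p) = -p*log2(p) - (1-p)*log2(1-p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(h2(0.1), h2(0.9))   # symmetric: the two values agree (up to rounding)
print(h2(0.5))            # maximum of 1.0 bit
print(h2(0.0), h2(1.0))   # 0.0 at both endpoints
```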