Shannon Entropy
How much surprise is in a message? Shannon entropy measures the average information content of a source — the minimum bits per character needed to encode it. Type or paste text below and watch information theory come alive.
H = −Σ p(x) log₂ p(x) • I(x) = −log₂ p(x)
About this lab
In 1948, Claude Shannon founded information theory with his landmark paper "A Mathematical Theory of Communication." He showed that every source of information has a fundamental quantity called entropy that measures the average surprise, or uncertainty, per symbol.
Entropy (H) is computed as:
H = −Σ p(x) log₂ p(x)
where p(x) is the probability of character x. Key insights:
- Low entropy means the text is predictable (e.g., "aaaaaa"). Few bits needed.
- High entropy means the text is unpredictable (e.g., random characters). More bits needed.
- Maximum entropy occurs when all characters are equally likely: H_max = log₂(N) where N is the alphabet size.
- Information content (surprise) of a single character is I(x) = −log₂(p(x)). Rare characters carry more surprise.
- Compression limit: Shannon entropy sets a theoretical lower bound on lossless compression. No scheme can do better on average.
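The formulas above translate directly into a few lines of code. Here is a minimal sketch in Python (the function names `entropy` and `surprise` are ours, chosen for this illustration, not part of the lab):

```python
from collections import Counter
from math import log2

def entropy(text: str) -> float:
    """Shannon entropy H = −Σ p(x) log₂ p(x), in bits per character.

    Written as Σ p(x) log₂(1/p(x)) so a single-symbol text yields
    exactly 0.0 rather than −0.0.
    """
    n, counts = len(text), Counter(text)
    return sum((c / n) * log2(n / c) for c in counts.values())

def surprise(text: str) -> dict:
    """Information content I(x) = −log₂ p(x) for each distinct character."""
    n, counts = len(text), Counter(text)
    return {ch: log2(n / c) for ch, c in counts.items()}

print(entropy("aaaaaa"))  # 0.0 — fully predictable, no surprise
print(entropy("abcd"))    # 2.0 — four equally likely symbols: log₂(4)
```

Rare characters get the largest `surprise` values: in `"aab"`, the lone `b` carries log₂(3) ≈ 1.58 bits, while each `a` carries only log₂(3/2) ≈ 0.58 bits.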
English text typically has entropy around 4–5 bits/character when measured from single-character frequencies alone. The effective entropy is much lower, roughly 1–1.5 bits/character, once word structure and long-range dependencies are taken into account; Shannon's own prediction experiments estimated it at 0.6–1.3 bits per character.
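To see where the 4–5 bits/character figure comes from, you can measure the single-character entropy of a sample and compare it with the log₂(N) ceiling. A sketch (the pangram is just an illustrative sample, and the `entropy` helper repeats the definition above):

```python
from collections import Counter
from math import log2

def entropy(text: str) -> float:
    """Unigram Shannon entropy in bits per character."""
    n, counts = len(text), Counter(text)
    return sum((c / n) * log2(n / c) for c in counts.values())

sample = "the quick brown fox jumps over the lazy dog"
h = entropy(sample)
h_max = log2(len(set(sample)))  # ceiling: all N distinct symbols equally likely
print(f"H = {h:.2f} bits/char, H_max = {h_max:.2f} bits/char")
```

Because the pangram uses 26 letters plus the space, H_max = log₂(27) ≈ 4.75 bits, and the measured H sits a little below it (spaces and common letters repeat). Unigram entropy always obeys 0 ≤ H ≤ log₂(N); the lower true entropy of English only shows up with models that look beyond single characters.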