Zipf's Law
The most common word in English text appears roughly twice as often as the second most common, three times as often as the third, and so on. This power-law relationship between rank and frequency, known as Zipf's law, is one of the most ubiquitous patterns in language.
Zipf's law, named after the linguist George Kingsley Zipf (1902–1950), states that the frequency of a word in a large corpus is inversely proportional to its rank: the r-th most common word occurs with frequency roughly proportional to 1/r^α, with the exponent α close to 1. On a log-log plot of frequency against rank, this appears as a straight line with slope close to −1.
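To make the relationship concrete, here is a minimal Python sketch that counts word frequencies in a text and prints frequency times rank, which Zipf's law predicts should stay roughly constant across ranks. The corpus.txt filename is a stand-in for any large plain-text file, and the tokenizer is deliberately crude:

    import re
    from collections import Counter

    def rank_frequency(text):
        """Return (word, count) pairs sorted from most to least frequent."""
        words = re.findall(r"[a-z']+", text.lower())
        return Counter(words).most_common()

    with open("corpus.txt") as f:          # stand-in for any large text file
        ranked = rank_frequency(f.read())

    # Under Zipf's law, freq * rank should hover near a constant.
    for rank, (word, freq) in enumerate(ranked[:10], start=1):
        print(f"{rank:>3}  {word:<12} {freq:>7}  freq*rank = {freq * rank}")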
The phenomenon extends far beyond language. City populations, income distributions, earthquake magnitudes, website traffic, and even the sizes of moon craters all follow approximate power laws. Why such universality? Several generative mechanisms have been proposed:
Herbert Simon (1955) showed that a "rich get richer" process naturally produces Zipfian distributions: each new token in a growing text is, with some small probability, a word never seen before, and otherwise repeats an existing word chosen with probability proportional to its current frequency. This is related to Yule's process and to preferential attachment in network science.
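To see the mechanism in action, the following sketch simulates a Simon-style process under the assumptions above; the innovation probability p = 0.05 is an arbitrary illustrative choice, not a value from the source:

    import random
    from collections import Counter

    def simon_process(n_tokens, p=0.05, seed=0):
        """Generate word ids by a rich-get-richer process: with probability p
        the next token is a brand-new word; otherwise it repeats a token drawn
        uniformly from the stream so far, which selects each existing word
        with probability proportional to its current frequency."""
        rng = random.Random(seed)
        stream, next_id = [0], 1
        for _ in range(n_tokens - 1):
            if rng.random() < p:
                stream.append(next_id)             # innovation: a new word enters
                next_id += 1
            else:
                stream.append(rng.choice(stream))  # repetition, ∝ frequency
        return stream

    counts = sorted(Counter(simon_process(200_000)).values(), reverse=True)
    for rank in (1, 10, 100, 1000):
        print(rank, counts[rank - 1])   # frequency falls roughly as a power of rank

Drawing uniformly from the stream is the whole trick: a word that already occurs k times has k chances of being picked next, which is exactly preferential attachment.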
An alternative view comes from information theory, echoing Zipf's own "principle of least effort": the law may reflect an optimal trade-off between speaker effort (favoring a few common words) and listener effort (favoring precise, rare words). The power-law exponent α ≈ 1 sits at the sweet spot between the two.
The fitted exponent α varies by language and genre. English prose typically gives α ≈ 1.0–1.1. Code and technical writing may show different exponents because of their constrained vocabularies. Random text (uniform word choice) will show a much flatter distribution.
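A quick way to estimate α from data is ordinary least squares on the log-log rank-frequency points. This is a rough sketch only: simple regression on log-log data is a biased estimator, and maximum-likelihood methods (e.g. Clauset, Shalizi, and Newman, 2009) are preferred for careful work:

    import math

    def fit_zipf_exponent(frequencies):
        """Fit freq ≈ C * rank**(-alpha) by least squares on log-log data.
        `frequencies` must be word counts sorted in descending order."""
        xs = [math.log(r) for r in range(1, len(frequencies) + 1)]
        ys = [math.log(f) for f in frequencies]
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                 / sum((x - mx) ** 2 for x in xs))
        return -slope   # the log-log slope is approximately -alpha

    # e.g. alpha = fit_zipf_exponent([freq for _, freq in ranked]),
    # reusing `ranked` from the counting sketch earlier in this section.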