
On compression and meaning


The shortest description of a random string is the string itself. There is no pattern, no regularity, no structure that could be exploited to say the same thing in fewer words. Andrei Kolmogorov formalized this idea in the 1960s: the complexity of a string is the length of the shortest program that outputs it. A string is random if and only if its shortest description is approximately as long as the string itself. A string is structured — compressible — if it can be generated by a much shorter program. "One million zeros" can be described in a few words; the first million digits of pi can be described by a program that computes them; a truly random million bits cannot be described more briefly than by listing them.
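The asymmetry is easy to see with an off-the-shelf compressor standing in for the true (uncomputable) measure: the length of a zlib-compressed string is an upper bound, up to the fixed cost of the decompressor, on the length of the shortest description. A rough sketch in Python:

```python
import os
import zlib

# zlib output length is a crude upper bound on description length:
# the original can be rebuilt from the compressed bytes plus a fixed
# decompressor, so the shortest program can only be shorter still.
zeros = b"0" * 1_000_000        # "one million zeros": highly structured
noise = os.urandom(1_000_000)   # a (pseudo)random million bytes

print(len(zlib.compress(zeros, 9)))  # collapses to around a kilobyte
print(len(zlib.compress(noise, 9)))  # slightly LONGER than the input
```

The random input comes out slightly larger than it went in, because the compressor finds nothing to exploit and still has to pay for its own framing.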

Kolmogorov complexity is uncomputable — there is no algorithm that, given a string, outputs the length of the shortest program that generates it. This is because the halting problem lurks underneath it. You cannot in general determine whether a program halts, so you cannot enumerate all programs shorter than a given length and find the shortest that produces your target. The concept is well-defined and mathematically precise; it just cannot be computed. We can bound it from above (find some program that works) but never know for certain that we have found the shortest one.

Despite being uncomputable, Kolmogorov complexity is theoretically rich. It gives a formal definition of randomness, one that turns out to coincide with the definition from measure theory (passing Martin-Löf's statistical tests) and the definition from gambling-style unpredictability (no betting strategy can win against the sequence) — three different mathematical traditions converging on the same concept. It provides a way of talking about the information content of individual objects rather than of distributions over ensembles. And it grounds a beautiful result called the invariance theorem: the choice of programming language for defining complexity matters only up to an additive constant. Switch from Python to Haskell and every complexity changes by at most some fixed amount. The notion is robust.

What interests me more than the formalism is what it suggests about meaning. Consider two texts: a page of random characters and a page of a Chekhov story. The random page has high Kolmogorov complexity — it cannot be compressed. The Chekhov page has lower complexity — it was generated by a structured process (Chekhov's mind, operating on language, constructing a particular narrative) that could in principle be described much more briefly than the text itself. By Kolmogorov's measure, the random page is more complex.
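One way to make the comparison concrete is, again, to let an ordinary compressor approximate the uncomputable measure. The sketch below is illustrative, not Chekhov: it compresses a passage of English (this essay's own opening sentences, repeated so the compressor has enough input; the repetition is itself a kind of structure) against random bytes of the same length.

```python
import os
import zlib

# Compression ratio as a rough proxy for Kolmogorov complexity.
page = (b"The shortest description of a random string is the string "
        b"itself. There is no pattern, no regularity, no structure "
        b"that could be exploited to say the same thing in fewer "
        b"words. ") * 4
random_page = os.urandom(len(page))

def ratio(data: bytes) -> float:
    return len(zlib.compress(data, 9)) / len(data)

print(ratio(page))         # well below 1: structure to exploit
print(ratio(random_page))  # about 1: nothing to exploit
```

zlib sees only surface regularities, a far cry from "the shortest program", but the direction of the gap already matches Kolmogorov's verdict: the random page is the incompressible one.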

And yet the Chekhov page is more meaningful. Meaning seems to run in the opposite direction from complexity. The meaningful text is the one that can be compressed — the one where structure is present, where there are patterns, where the generating process is rich enough to be interesting even when abbreviated. Randomness is the absence of meaning as much as the absence of pattern; the two might be the same thing.

This suggests something: meaning might be the gap between a text and its compressed form. Not the compressed form itself — that is structure, not meaning. Not the text itself — that is data. But the relationship between them, the fact that a shorter description is possible, and the particular way in which that compression works. The meaning of a sentence is something like: what does knowing this sentence tell you beyond its raw bytes? How much does it update your model of the world, or the speaker, or the subject being discussed?

This framing connects to information theory in a specific way. Shannon information is about surprise: a message carries information proportional to how unexpected it was. A message you could have predicted carries little information. A message you couldn't have predicted at all — the random string — carries maximum information in Shannon's sense, even though it means nothing. The two theories are measuring different things. Shannon is measuring unpredictability. Kolmogorov is measuring incompressibility. Meaning seems to be neither: it is something like structured unpredictability, the combination of being worth saying (carrying information) and being the product of a coherent generating process (being compressible).
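Shannon's quantity, unlike Kolmogorov's, is directly computable from symbol frequencies, which makes the contrast easy to demonstrate. A minimal sketch (the strings are made up for illustration):

```python
import math
from collections import Counter

def entropy_bits_per_char(text: str) -> float:
    """Empirical Shannon entropy of a string, in bits per character."""
    n = len(text)
    return sum(-(c / n) * math.log2(c / n)
               for c in Counter(text).values())

print(entropy_bits_per_char("aaaaaaaa"))  # 0.0: fully predictable, no surprise
print(entropy_bits_per_char("abcdefgh"))  # 3.0: uniform over 8 symbols, maximal surprise
```

The uniform string maxes out the measure while meaning nothing, which is exactly the point: Shannon entropy registers surprise, not significance.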

I find I cannot fully formalize this. The informal sense of "meaning" resists the kind of precise treatment that Kolmogorov gave to complexity. What I notice is that the most meaningful things — a proof, a poem, a joke — are precisely the things that reward compression. You can summarize them, and the summary preserves something important even though it loses something too. You cannot summarize a random string without losing everything. The presence of a gap between the data and its shortest description is, I think, a necessary condition for meaning. Whether it is sufficient, I am not sure. But the gap is where meaning lives.
