Monday, February 22, 2016

Zipf's law is so simple -- and so strange

Here's a quick explanation of Zipf's Law:
Zipf's law in its simplest form, as formulated in the thirties by American linguist George Kingsley Zipf, states surprisingly that the most frequently occurring word in a text appears twice as often as the next most frequent word, three times more than the third most frequent one, four times more than the fourth most frequent one, and so on.
I mean, seriously. How can this possibly be true? And yet it is, at least to a good approximation. Each one of us can write a book using whatever words occur to us, and yet our book -- and seemingly every other book -- will end up following this law. It's as if we're being governed by something we are neither aware of nor understand. I've known about this for a long time, but every time I see it referenced, as I did today at phys.org, I'm shocked. It goes all the way down the line, too. The 96th most-used word appears about 1/96 as often as the most-used word. And on and on. How can this be?
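
If you'd like to check this on a text of your own, here is a minimal sketch in Python. It is only my illustration, not the method used by the researchers in the article. It counts words, ranks them from most to least frequent, and compares each count with what the simplest form of the law predicts: the count of the top word divided by the rank. The function names (rank_frequency, zipf_prediction) and the crude rule for splitting text into words are my own assumptions.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Assumption: a "word" is any run of letters or apostrophes,
    lowercased. Real studies use more careful tokenization.
    """
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


def zipf_prediction(top_count, rank):
    """Count predicted by the simplest form of Zipf's law for a rank."""
    return top_count / rank


def compare(text, how_many=10):
    """Print observed vs. predicted counts for the top-ranked words."""
    table = rank_frequency(text)
    if not table:
        return
    top_count = table[0][2]
    for rank, word, count in table[:how_many]:
        predicted = zipf_prediction(top_count, rank)
        print(f"{rank:>3}  {word:<12} observed={count:<6} predicted={predicted:.1f}")
```

To try it, read a long plain-text book into a string and call compare(book_text). Expect a rough match rather than an exact one: the law describes a trend, and short texts in particular will wander away from it.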

The linked article is about researchers who recently applied Zipf's law to ancient texts and found that they, too, comply with it. The search was occasioned by the existence of Big Data in our technologically enhanced world. Prior to this, Zipf's law had only been tested on a limited number of texts. The researchers saw no reason to limit their investigation into the resilience of this law, so they looked into texts from all around the world, including ancient texts -- and they found that they all follow Zipf's law. In any language, in any age, Zipf's law holds true.

Each time I encounter Zipf's law, I am shocked anew. It's as if there's a whole 'nother level of rules that guide our lives -- rules we neither sense nor grasp, and yet we use them unfailingly. It's just amazing.

PS: When quantum computers are realized, I suspect we'll see lots of new correlations in the world around us. What we are aware of is so much less than what's really there. It's exciting and, as I keep saying, very strange.