Here's a quick explanation of Zipf's Law:
Zipf's law in its simplest form, as formulated in the thirties by American linguist George Kingsley Zipf, states, surprisingly, that the most frequently occurring word in a text appears twice as often as the next most frequent word, three times as often as the third most frequent one, four times as often as the fourth most frequent one, and so on.
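For anyone who likes to see it as a formula, here is a rough way to write that down. The symbols are my own labels, not anything from the article: r is a word's rank (1 for the most frequent word, 2 for the next, and so on) and f(r) is how many times the word at rank r appears in the text.

```latex
f(r) \approx \frac{f(1)}{r}
\qquad\text{so}\qquad
f(2) \approx \frac{f(1)}{2},\quad
f(3) \approx \frac{f(1)}{3},\quad
f(96) \approx \frac{f(1)}{96}
```

To make up a number purely for illustration: if the top word showed up 960 times, this simplest form would put the 96th word at roughly 10 appearances. Real texts only follow the pattern approximately, which is why I've used "approximately equal" rather than an equals sign.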
I mean, seriously. How can this possibly be true? And yet it is. Each one
of us can write a book using whatever words occur to us, and yet our
book -- and all other books -- will comply with this law, at least
approximately. It's as if we're being governed by something we are
neither aware of nor understand. I've known about this for a long time,
but every time I see it referenced, as I did today at phys.org, I'm
shocked. It goes all the way down the line, too. The 96th most
frequently used word appears about 1/96th as often as the most
frequently used word. And on and on. How can this be?
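If you'd like to check this against a book of your own, here is a minimal Python sketch of how one might do it. It counts the words in a text, ranks them, and puts each word's actual count next to what the simplest form of Zipf's law would predict (the top word's count divided by the rank). The function names and the crude way of splitting text into words are my own assumptions, not anything taken from the phys.org article or the researchers' method.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(counts, start=1)]


def zipf_prediction(top_count, rank):
    """Count predicted by the simplest form of Zipf's law for a given rank."""
    return top_count / rank


def compare_to_zipf(text, max_rank=10):
    """Return (rank, word, actual count, predicted count) for the top ranks."""
    table = rank_frequency(text)
    if not table:
        return []
    top_count = table[0][2]
    return [(rank, word, count, zipf_prediction(top_count, rank))
            for rank, word, count in table[:max_rank]]


if __name__ == "__main__":
    # Toy text only; a real test needs something book-length.
    sample = "the cat the dog the cat"
    for rank, word, count, predicted in compare_to_zipf(sample):
        print(rank, word, count, round(predicted, 2))
```

A toy sentence like the one above is far too short to show anything. To try it for real, read a whole book into a string and pass it to `compare_to_zipf`, then see how close the last two columns stay as the rank grows. Expect a rough match rather than an exact one.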
The linked article is about researchers who recently applied Zipf's law to
ancient texts, and found that they too comply with this law. This search
was occasioned by the existence of Big Data in our technologically
enhanced world. Prior to this, Zipf's law had only been tested on a
limited number of texts. The researchers thought there was no reason to
limit their investigation into the resilience of this law. So they
looked into texts from all around the world, including ancient texts --
and they found that they all follow Zipf's law. In any language, in any age, Zipf's law holds true.
Each time I encounter Zipf's law, I am shocked anew. It's as if there's a
whole 'nother level of rules that guide our lives -- rules we neither
sense nor grasp, and yet we use them unfailingly. It's just amazing.
PS: When quantum computers are realized, I suspect we'll see lots of new
correlations in the world around us. What we are aware of is so much
less than what's really there. It's exciting and, as I keep saying, very
strange.