© 1997-2009, Millennium Mathematics Project, University of Cambridge.
Permission is granted to print and copy this page on paper for non-commercial use. For other uses, including electronic redistribution, please contact us.

Thursday, August 21, 2008

The mystery of Zipf

Zipf's law arose out of an analysis of language by the linguist George Kingsley Zipf, who theorised that in a large body of text (a long book, say, or every word uttered by Plus employees during the day) the frequency of each word is close to inversely proportional to its rank in the frequency table: the second most common word appears about half as often as the most common one, the third about a third as often, and so on. We thought we would test this out on Plus. What does it imply about how we use language and how it evolved?
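To make the claim concrete, here is a minimal Python sketch of how such a test might be run. It is not the analysis Plus carried out: the function name `rank_frequency`, the tokenising pattern and the sample sentence are all illustrative assumptions. It builds the frequency table and prints rank times count for each word, which Zipf's law predicts should stay roughly constant for a large enough text.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Hypothetical helper: words are taken to be runs of letters and
    apostrophes, compared case-insensitively.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# A toy sample; a real test needs a long text before the pattern emerges.
sample = "the cat and the dog and the bird"
for rank, word, count in rank_frequency(sample):
    # If frequency is inversely proportional to rank, rank * count
    # is approximately the same for every row.
    print(rank, word, count, rank * count)
```

On a sample this short the product will not be constant; the sketch only shows the bookkeeping. Feeding in a whole book is what makes the comparison meaningful.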


1 Comment:

At 3:41 PM, Blogger Cassandra said...

Is it really a mystery? I have at least one idea off the top of my head.

One way to construct networks with power-law degree distributions (scale-free networks) is through growth and decay rules: for example, the next link added has the highest probability of connecting to the node that already has the highest degree, that is, the most existing connections. Language evolves by adopting and abandoning words, so it seems likely that word frequencies could follow a power law because words are added and removed over time under a similar set of rules (at some level).

The only question is what exactly do such network nodes and their degrees map to?

Nodes seem to map to words, or perhaps to the ideas represented by words, or to word-sounds or word-ideas. If the nodes map to ideas, then there is also a link to memes and various mind-external scale-free structures.

Nodal degree seems to be related to the usage of the word: either simply the frequency of usage or something deeper that results in that frequency.
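The growth rule suggested in this comment can be sketched as a simple rich-get-richer simulation. This is an illustration under stated assumptions, not the commenter's own model: it takes "nodes" to be words and "degree" to be the number of times a word has been used so far, it models growth only (no words are abandoned), and the names `simulate_rich_get_richer` and `p_new` are hypothetical. With probability `p_new` the next token is a brand-new word; otherwise it repeats a token picked uniformly from the text so far, so frequently used words are proportionally more likely to be used again.

```python
import random
from collections import Counter


def simulate_rich_get_richer(n_tokens, p_new=0.1, seed=0):
    """Grow a 'text' of word ids and return a Counter of word usage.

    Assumption: reuse probability is proportional to current usage,
    which is what choosing a past token uniformly at random achieves.
    """
    rng = random.Random(seed)
    tokens = [0]   # the text so far, starting with a single word
    next_id = 1    # id that the next new word will receive
    for _ in range(n_tokens - 1):
        if rng.random() < p_new:
            tokens.append(next_id)             # adopt a new word
            next_id += 1
        else:
            tokens.append(rng.choice(tokens))  # reuse, favouring common words
    return Counter(tokens)


counts = simulate_rich_get_richer(20000)
ranked = [count for _, count in counts.most_common()]
# Usage counts of the ten most used words, in rank order.
print(ranked[:10])
```

Plotting the ranked counts on log-log axes is one way to check whether they fall near a straight line, as a power law would suggest.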

 
