Abstract
Zipf's law states that the frequency of a word is a power function of its rank, with an exponent usually accepted to be close to (-)1. However, large deviations between the predicted and actual number of distinct words in a text, disagreement between the predicted and observed exponent of the probability density function, and statistics over a large corpus make it evident that word frequency as a function of rank follows two different exponents: ~(-)1 for the first regime and ~(-)2 for the second. The implications of this change in exponent for the metrics of texts and for the origins of complex lexicons are analyzed.
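As a minimal illustration of the claim (not code from the paper), the sketch below builds a synthetic rank-frequency curve that follows an exponent of -1 up to a hypothetical breakpoint and -2 beyond it, then recovers both exponents from least-squares slopes in log-log space; the breakpoint and constants are arbitrary choices for the example.

```python
import math

def two_regime_freq(rank, breakpoint=100, c=1.0):
    """Frequency as a power function of rank: exponent -1 up to the
    breakpoint, exponent -2 beyond it (pieces matched for continuity)."""
    if rank <= breakpoint:
        return c * rank ** -1.0
    return c * breakpoint * rank ** -2.0

def loglog_slope(ranks, freqs):
    """Least-squares slope of log(frequency) versus log(rank),
    i.e. the estimated power-law exponent."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

first = list(range(1, 101))          # ranks in the first regime
second = list(range(101, 10001))     # ranks in the second regime
s1 = loglog_slope(first, [two_regime_freq(r) for r in first])
s2 = loglog_slope(second, [two_regime_freq(r) for r in second])
# s1 is close to -1, s2 close to -2, mirroring the two regimes
```

On real corpus data the same slope estimator would be applied to observed rank-frequency counts, with the breakpoint located empirically rather than fixed in advance.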