Abstract
We present a method that automatically extracts the
meaning of words and phrases from the World Wide Web
using Google page counts. The approach is novel in its
unrestricted problem domain, simplicity of
implementation, and manifestly ontological
underpinnings. The World Wide Web is the largest
database on earth, and the latent semantic context
information entered by millions of independent users
averages out to yield automatically extracted meaning
of useful quality. We demonstrate positive correlations,
evidencing an underlying semantic structure, in both
numerical symbol notations and number-name words in a
variety of natural languages and contexts. Next, we
demonstrate the ability to distinguish between colours
and numbers, to distinguish between 17th-century
Dutch painters, and to understand electrical terms,
religious terms, and emergency incidents. We then
conduct a massive experiment in understanding WordNet
categories, and finally demonstrate a simple automatic
English-Spanish translation.
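
The abstract does not spell out how page counts are turned into a measure of meaning. As a rough illustration only, the sketch below assumes a normalized distance computed from three page counts (hits for each term alone and for both together) and an index size. The function name, the formula choice, and all numbers are assumptions made for illustration, not values or code taken from this text.

```python
import math


def normalized_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """Distance between two terms from search page counts (illustrative).

    fx, fy : page counts for each term alone (hypothetical inputs)
    fxy    : page count for pages containing both terms
    n      : assumed total number of indexed pages

    Smaller values mean the terms co-occur more often than chance
    would suggest; 0.0 means they always appear together.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # No co-occurrence (or an unseen term): treat as unrelated.
        return float("inf")
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


if __name__ == "__main__":
    # Made-up counts, chosen only to show how the function is called.
    index_size = 10**9
    close = normalized_distance(1000, 1000, 500, index_size)
    far = normalized_distance(1000, 1000, 10, index_size)
    print(close < far)
```

Terms that share many pages come out closer than terms that rarely co-occur, which is the kind of signal the experiments listed above rely on.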