Get free access to the in-progress manuscript of Programming Pig via the Open Feedback Publishing System (OFPS). Interact with the authors and the community, and provide your feedback in real time.
The Datawrangling blog was put on the back burner last May while I focused on my startup. Now that I have some bandwidth again, I am getting back to work on several pet projects (including the Amazon EC2 Cluster).
(2000) Sun Le, Jin Youbing, Du Lin, & Sun Yufang: Automatic extraction of English-Chinese term lexicons from noisy bilingual corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 751-755. [PDF, 128KB]
Because a Bayesian probability calculation over a simple co-occurrence frequency table built from the same data has similar disambiguation capabilities, the paper also compares LSA with the Bayesian model.
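As a rough illustration of what a Bayesian calculation over a co-occurrence frequency table can look like, here is a minimal naive Bayes sketch. It is not the paper's model, which is not described here. The table `COUNTS`, the sense labels, the function name `disambiguate`, the add-one smoothing, and the choice of prior are all assumptions made for the example.

```python
import math

# Hypothetical co-occurrence table: COUNTS[sense][context_word] = frequency.
# The senses and numbers are invented for illustration only.
COUNTS = {
    "bank/finance": {"money": 8, "loan": 6, "water": 1},
    "bank/river": {"water": 7, "fishing": 4, "money": 1},
}


def disambiguate(context, counts=COUNTS):
    """Return the sense with the highest naive Bayes score for the context words."""
    vocab = {word for row in counts.values() for word in row}
    total = sum(sum(row.values()) for row in counts.values())
    best_sense, best_score = None, -math.inf
    for sense, row in counts.items():
        sense_total = sum(row.values())
        # Assumed prior: this sense's share of all co-occurrence counts.
        score = math.log(sense_total / total)
        for word in context:
            # Add-one smoothing so an unseen word does not zero out the product.
            score += math.log((row.get(word, 0) + 1) / (sense_total + len(vocab)))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


print(disambiguate(["money", "loan"]))
print(disambiguate(["water", "fishing"]))
```

Working in log space avoids underflow when a context contains many words. A comparison with LSA would replace the per-word counts with similarity in a reduced vector space, which this sketch does not attempt.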
S. McNaught. Proceedings of the 3rd International Conference on Theoretical and Methodological Issues in Machine Translation of Natural Language (1990).