Get free access to the in-progress manuscript of Programming Pig via the Open Feedback Publishing System (OFPS). Interact with the authors and the community, and provide feedback in real time.
The Datawrangling blog was put on the back burner last May while I focused on my startup. Now that I have some bandwidth again, I am getting back to work on several pet projects (including the Amazon EC2 Cluster).
(2000) Sun Le, Jin Youbing, Du Lin, & Sun Yufang: Automatic extraction of English-Chinese term lexicons from noisy bilingual corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 751-755.
Because a Bayesian probability calculation over a simple co-occurrence frequency table built from the same data has similar disambiguation capabilities, the paper also compares LSA with the Bayesian model.
D. Oard, D. Doermann, B. Dorr, D. He, P. Resnik, A. Weinberg, W. Byrne, S. Khudanpur, D. Yarowsky, A. Leuski, and 2 other authors. NAACL '03: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pages 76-78. Morristown, NJ, USA, Association for Computational Linguistics, (2003)
C. Callison-Burch, M. Osborne, and P. Koehn. Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics, pages 249-256. Trento, Italy, (2006)
D. Marcu and W. Wong. EMNLP '02: Proceedings of the ACL-02 conference on Empirical methods in natural language processing, pages 133-139. Morristown, NJ, USA, Association for Computational Linguistics, (2002)