The Datawrangling blog was put on the back burner last May while I focused on my startup. Now that I have some bandwidth again, I am getting back to work on several pet projects (including the Amazon EC2 Cluster).
(2000) Sun Le, Jin Youbing, Du Lin, & Sun Yufang: Automatic extraction of English-Chinese term lexicons from noisy bilingual corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 751-755. [PDF, 128KB]
As the use of a Bayesian probability calculation on a simple co-occurrence frequency table created from the same data has similar disambiguation capabilities, the paper also incorporates comparison of LSA with the Bayesian model.
A. Hadgu, A. Aregawi, and A. Beaudoin. (2021)cite arxiv:2112.08191Comment: 4 pages, 2 figures, 35th Conference on Neural Information Processing Systems (NeurIPS 2021) demonstrations track.
A. Berard, O. Pietquin, C. Servan, and L. Besacier. (2016)cite arxiv:1612.01744Comment: accepted to NIPS workshop on End-to-end Learning for Speech and Audio Processing.