The Datawrangling blog was put on the back burner last May while I focused on my startup. Now that I have some bandwidth again, I am getting back to work on several pet projects (including the Amazon EC2 Cluster).
(2000) Sun Le, Jin Youbing, Du Lin, & Sun Yufang: Automatic extraction of English-Chinese term lexicons from noisy bilingual corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 751-755. [PDF, 128KB]
Because a Bayesian probability calculation over a simple co-occurrence frequency table built from the same data offers similar disambiguation capabilities, the paper also compares LSA with the Bayesian model.
L. Bungum, and S. Oepen. Proceedings of the 13th Annual Meeting of the European Association for Machine Translation (EAMT-09), Barcelona, Spain, (2009)
E. Chi, and T. Mytkowicz. HT '08: Proceedings of the nineteenth ACM conference on Hypertext and hypermedia, page 81–88. New York, NY, USA, ACM, (2008)
H. Halpin, V. Robu, and H. Shepherd. WWW '07: Proceedings of the 16th international conference on World Wide Web, page 211–220. New York, NY, USA, ACM, (2007)
S. Oldenburg, M. Garbe, and C. Cap. SSM '08: Proceeding of the 2008 ACM Workshop on Search in Social Media, page 11–18. New York, NY, USA, ACM, (October 2008)
F. Sánchez-Martínez, M. Forcada, and A. Way. Proceedings of the 3rd Workshop on Example-Based Machine Translation, page 11–18. Dublin, Ireland, Centre for Next Generation Localisation (CNGL), (2009)
S. Noël, and R. Beale. Proceedings of the 22nd British CHI Group Annual Conference on HCI 2008: People and Computers XXII: Culture, Creativity, Interaction, 2, page 71–74. (2008)
E. Rader, and R. Wash. CSCW '08: Proceedings of the ACM 2008 conference on Computer supported cooperative work, page 239–248. New York, NY, USA, ACM, (2008)
P. Heymann, G. Koutrika, and H. Garcia-Molina. WSDM '08: Proceedings of the international conference on Web search and web data mining, page 195–206. New York, NY, USA, ACM, (2008)
M. Ames, and M. Naaman. CHI '07: Proceedings of the SIGCHI conference on Human factors in computing systems, page 971–980. New York, NY, USA, ACM, (2007)
C. Marlow, M. Naaman, D. Boyd, and M. Davis. HYPERTEXT '06: Proceedings of the seventeenth conference on Hypertext and hypermedia, page 31–40. New York, NY, USA, ACM, (2006)
M. Carman, M. Baillie, R. Gwadera, and F. Crestani. SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, page 123–130. New York, NY, USA, ACM, (2009)
S. Bao, G. Xue, X. Wu, Y. Yu, B. Fei, and Z. Su. WWW '07: Proceedings of the 16th international conference on World Wide Web, page 501–510. New York, NY, USA, ACM, (2007)
L. Muñoz, S. Rojas, and M. Rosell. Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation, page 75–82. Alicante, Universidad de Alicante. Departamento de Lenguajes y Sistemas Informáticos, (2009)
M. Zubizarreta, F. Tyers, and G. Ramírez-Sánchez. Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation, page 3–10. Alicante, Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, (2009)