Inproceedings

An unsupervised model for text message normalization

CALC '09: Proceedings of the Workshop on Computational Approaches to Linguistic Creativity, pages 71-78. Morristown, NJ, USA, Association for Computational Linguistics (2009)

Abstract

Cell phone text messaging users express themselves briefly and colloquially using a variety of creative forms. We analyze a sample of creative, non-standard text message word forms to determine frequent word formation processes in texting language. Drawing on these observations, we construct an unsupervised noisy-channel model for text message normalization. On a test set of 303 text message forms that differ from their standard form, our model achieves 59% accuracy, which is on par with the best supervised results reported on this dataset.
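
As a rough illustration of the noisy-channel idea named in the abstract (not the authors' model), normalization can be framed as choosing the standard word w that maximizes P(t | w) P(w) for an observed text form t. The sketch below assumes a hypothetical toy vocabulary with made-up probabilities, and it uses string similarity from Python's difflib as a stand-in for the channel model. None of these names or values come from the paper.

```python
from difflib import SequenceMatcher

# Hypothetical unigram language model P(w); values are invented for illustration.
WORD_PROB = {"tomorrow": 0.4, "tomato": 0.1, "see": 0.3, "sea": 0.2}


def channel_score(text_form: str, word: str) -> float:
    """Toy stand-in for P(t | w): a similarity score in [0, 1], not a true probability."""
    return SequenceMatcher(None, text_form, word).ratio()


def normalize(text_form: str) -> str:
    """Return the candidate word w that maximizes channel_score(t, w) * P(w)."""
    return max(WORD_PROB, key=lambda w: channel_score(text_form, w) * WORD_PROB[w])


print(normalize("tmrw"))  # tomorrow
```

A real channel model would presumably assign probabilities per word formation process rather than relying on a generic similarity score. The similarity function here only keeps the example self-contained.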
