Speech technology has the potential to let everyone participate in today's information revolution and to bridge language barriers. Unfortunately, building speech processing systems requires significant resources. With some 6,900 languages in the world, speech processing has traditionally been prohibitively expensive for all but the most economically viable languages. Despite recent improvements in speech processing, supporting a new language remains a skilled job that requires significant effort from trained individuals. SPICE aims to overcome both limitations by providing an interactive language creation and evaluation toolkit that allows everyone to develop speech processing models, collect appropriate data for model building, and evaluate the results, enabling iterative improvements.
ConceptNet is a freely available commonsense knowledge base and natural-language-processing toolkit that supports many practical textual-reasoning tasks over real-world documents out of the box (without additional statistical training), including
Finding important information in unstructured text
The vast majority of the information we deal with in everyday life consists of raw, unstructured text, where the most important facts or concepts are not always readily available but are hidden among the myriad details that accompany them. To handle and digest the sheer amount of information we are exposed to in this information age, more sophisticated procedures are required to unveil the important parts of a text and to allow us to process more information in less time. The goal of this project is to develop robust and accurate techniques to automatically extract important information from unstructured text, in the form of keyphrases (keyphrase extraction) or entire sentences (extractive summarization).
Funded by Google
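As a rough illustration of the two tasks named in the project description, the sketch below ranks words by frequency (a crude stand-in for keyphrase extraction) and picks the sentences whose words are most frequent on average (a crude stand-in for extractive summarization). This is not the project's method, which is not described here. The stopword list, the naive sentence splitter, and the function names extract_keyphrases and summarize are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {
    "the", "a", "an", "of", "to", "in", "and", "is", "are", "we", "that",
    "it", "for", "on", "with", "as", "this", "be", "or", "by", "not", "but",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lowercased alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def extract_keyphrases(text, k=3):
    """Return the k most frequent content words (single-word 'keyphrases')."""
    counts = Counter(content_words(text))
    return [word for word, _ in counts.most_common(k)]


def summarize(text, n=1):
    """Return the n highest-scoring sentences, kept in their original order."""
    freq = Counter(content_words(text))
    sentences = split_sentences(text)

    def score(sentence):
        # Average document-level frequency of the sentence's content words.
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    best = sorted(range(len(sentences)),
                  key=lambda i: score(sentences[i]),
                  reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]


if __name__ == "__main__":
    sample = "Cats chase mice. Cats like cats. Dogs bark."
    print(extract_keyphrases(sample, k=1))
    print(summarize(sample, n=1))
```

Averaging the word frequencies, rather than summing them, keeps long sentences from winning on length alone. Published systems typically rely on richer signals such as graph-based ranking or trained classifiers.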