Libtextcat is a library with functions that implement the classification technique described in Cavnar & Trenkle, "N-Gram-Based Text Categorization" [1]. It was primarily developed for language guessing, a task on which it is known to perform with near-pe
Chomsky bot written in Ruby. A funny little program that generates random paragraphs of text from a set of sentence building blocks. It combines four kinds of phrases (introduction phrases, subject phrases, verb phrases, and object phrases) into a sentence. The sentences this simple construction can create are amazing: they are syntactically correct, and the resulting text "hovers on the edge of understandability".
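The four-part construction described above can be sketched in a few lines. This is only an illustration in Python, not the Ruby bot's actual code: the phrase lists are invented placeholders, and the names `make_sentence` and `make_paragraph` are hypothetical.

```python
import random

# Hypothetical phrase lists, invented for illustration only;
# the real bot ships its own, much larger set of building blocks.
INTRODUCTIONS = [
    "To put it another way,",
    "On the other hand,",
    "It follows that",
]
SUBJECTS = [
    "the notion of level of grammaticalness",
    "a descriptively adequate grammar",
    "the theory of syntactic features",
]
VERBS = [
    "does not readily tolerate",
    "is not subject to",
    "appears to correlate closely with",
]
OBJECTS = [
    "a general convention regarding the forms of the grammar.",
    "the traditional practice of grammarians.",
    "an abstract underlying order.",
]


def make_sentence(rng):
    """Pick one phrase of each kind and join them in a fixed order."""
    parts = [
        rng.choice(INTRODUCTIONS),
        rng.choice(SUBJECTS),
        rng.choice(VERBS),
        rng.choice(OBJECTS),
    ]
    return " ".join(parts)


def make_paragraph(rng, n_sentences=4):
    """Build a paragraph from independently generated sentences."""
    return " ".join(make_sentence(rng) for _ in range(n_sentences))


if __name__ == "__main__":
    print(make_paragraph(random.Random()))
```

Because every slot is filled from a list of phrases of the same grammatical kind, any combination yields a well-formed sentence, which is why the output stays syntactically correct even though the content is random.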
M. Schwab, R. Jäschke, and F. Fischer. Proceedings of the 6th International Conference on Natural Language and Speech Processing, pages 99--109. Association for Computational Linguistics, (2023)
R. Karimi Mahabadi, J. Henderson, and S. Ruder. Advances in Neural Information Processing Systems, 34, pages 1022--1035. Curran Associates, Inc., (2021)
P. Xia, S. Wu, and B. Van Durme. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 7516--7533. Association for Computational Linguistics, (November 2020)
S. Bordia and S. Bowman. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 7--15. Minneapolis, Minnesota, Association for Computational Linguistics, (June 2019)
S. Blodgett, S. Barocas, H. Daumé III, and H. Wallach. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5454--5476. Online, Association for Computational Linguistics, (July 2020)