Libtextcat is a library with functions that implement the classification technique described in Cavnar & Trenkle, "N-Gram-Based Text Categorization" [1]. It was primarily developed for language guessing, a task on which it is known to perform with near-pe
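The general idea behind the Cavnar & Trenkle technique can be sketched as follows: build a ranked list of the most frequent character n-grams for each category, build the same kind of list for the text to classify, and choose the category whose ranking differs least (the "out-of-place" measure). This is a minimal Python sketch, not libtextcat's actual API. The function names `ngram_profile`, `out_of_place` and `classify` are hypothetical, and the values `max_n=5` and `top_k=300` are illustrative assumptions.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top_k=300):
    """Return the top_k most frequent character n-grams, most frequent first."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # mark word boundaries (an assumption of this sketch)
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count; break ties alphabetically so ranks are deterministic.
    ranked = sorted(counts, key=lambda gram: (-counts[gram], gram))
    return ranked[:top_k]


def out_of_place(doc_profile, category_profile):
    """Sum of rank differences; n-grams missing from the category get a maximum penalty."""
    rank = {gram: position for position, gram in enumerate(category_profile)}
    max_penalty = len(category_profile)
    return sum(
        abs(position - rank[gram]) if gram in rank else max_penalty
        for position, gram in enumerate(doc_profile)
    )


def classify(text, category_profiles):
    """Return the name of the category whose profile is closest to the text's profile."""
    doc_profile = ngram_profile(text)
    return min(
        category_profiles,
        key=lambda name: out_of_place(doc_profile, category_profiles[name]),
    )
```

To use it, build one profile per language from sample text, for example `profiles = {"english": ngram_profile(english_sample)}`, and pass the resulting dictionary to `classify`. Real profiles would be trained on far more text than a single sentence.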
Chomsky bot written in Ruby. A funny little thing which generates random paragraphs of text from a set of sentence building blocks. It combines four kinds of phrases (introduction phrases, subject phrases, verb phrases and object phrases) into a sentence. The sentences this simple construction can create are amazing. They are syntactically correct and "hover on the edge of understandability".
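The four-part construction described above can be sketched in a few lines. The bot itself is written in Ruby. This sketch uses Python instead, and the phrase lists are made-up placeholders rather than the bot's real building blocks. The names `make_sentence` and `make_paragraph` are likewise hypothetical.

```python
import random

# Placeholder phrase lists (invented for illustration, not taken from the bot).
INTRODUCTIONS = ["To begin with,", "On the other hand,", "Clearly,"]
SUBJECTS = [
    "the notion of structural ambiguity",
    "a simplified account of grammar",
    "this selection of examples",
]
VERBS = ["does not readily explain", "appears to depend on", "is not affected by"]
OBJECTS = [
    "the rules of the system.",
    "an earlier analysis.",
    "the intuition of a native speaker.",
]


def make_sentence(rng):
    """Pick one phrase of each kind and join them in a fixed order."""
    parts = (INTRODUCTIONS, SUBJECTS, VERBS, OBJECTS)
    return " ".join(rng.choice(part) for part in parts)


def make_paragraph(n_sentences=4, seed=None):
    """Join several random sentences; pass a seed for reproducible output."""
    rng = random.Random(seed)
    return " ".join(make_sentence(rng) for _ in range(n_sentences))
```

Because every sentence follows the same introduction, subject, verb, object order, the output stays grammatical as long as each phrase fits its slot, no matter which combination is drawn.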
M. Schwab, R. Jäschke, and F. Fischer. Proceedings of the 6th International Conference on Natural Language and Speech Processing, pp. 99--109. Association for Computational Linguistics, (2023)
P. Xia, S. Wu, and B. Van Durme. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 7516--7533. Association for Computational Linguistics, (November 2020)
S. Bordia and S. Bowman. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pp. 7--15. Minneapolis, Minnesota, Association for Computational Linguistics, (June 2019)
S. Blodgett, S. Barocas, H. Daumé III, and H. Wallach. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 5454--5476. Online, Association for Computational Linguistics, (July 2020)