The TextMarker system is a rule-based tool for information extraction and text processing tasks. The comprehensible rule language can be easily extended and supports several scripting functionalities. TextMarker uses DLTK and UIMA.
The Natural Programming Project is working on making programming languages and environments easier to learn, more effective, and less error-prone. We are taking a human-centered approach, first studying how people perform their tasks and then designing languages and environments around people's natural tendencies. We focus on all kinds of programmers, including professional programmers, novice programmers who are trying to become experts, and end users, who program to support other jobs or hobbies, such as multimedia authoring, simulations, teaching, prototyping, and other activities supported by computing.
(2000) Sun Le, Jin Youbing, Du Lin, & Sun Yufang: Automatic extraction of English-Chinese term lexicons from noisy bilingual corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 751-755. [PDF, 128KB]
This relates to the recent Slashdot-posted paper about the world being a VR. If the human mind is indeed non-computable, the world can't be a VR. Cf. On Intelligence.
I am investigating computational models for linguistic structures and processes, with application to language technologies and to the documentation of endangered languages. My current focus is on efficient querying of databases of hierarchically annotated data. After completing a PhD on computational phonology at the University of Edinburgh in 1990, I worked on a series of European research projects and conducted linguistic fieldwork in Cameroon with SIL. In 1998 I moved to the University of Pennsylvania, becoming Associate Director of the LDC and working on models and tools for linguistic annotation. In 2002 I returned home to Australia and established the Melbourne University Language Technology Group. In 2007 I was awarded the Kelvin Medal for excellence in teaching.
Key Activities: Coordinating first year Informatics; developing the Natural Language Toolkit; writing a textbook on NLP; leading the Language Technology Group; working on an NSF project on Querying Linguistic Databases; and editing Cambridge Studies in Natural Language Processing and the ACL Anthology.
Key Publications: Natural Language Processing in Python; Computational phonology: A constraint-based approach (Cambridge); A formal framework for linguistic annotation (Speech Communication); Seven dimensions of portability for language documentation and description (Language); Designing and evaluating an XPath dialect for linguistic queries (ICDE).
* Morphology and computational linguistics for absolute beginners
* The role of morphology in computational linguistics
* Morphology: tasks and approaches to solving them
* Pseudo-lemmatization, compounds, and other strange little words
T. Völker, J. Pfister, T. Koopmann, and A. Hotho. (2024). arXiv:2401.09092. Comment: Accepted at 2024 ACM SIGIR CHIIR. For a demo, see http://professor-x.de/demos/bibsonomy-chatgpt/demo.mp4.
Y. Lu, J. Li, X. Wang, H. Shi, T. Chen, and S. Tang. Findings of the Association for Computational Linguistics: EMNLP 2023, pp. 7447–7457. Singapore, Association for Computational Linguistics, (December 2023)
M. Sengupta. Findings of the Association for Computational Linguistics: EMNLP 2023, pp. 4636–4659. Association for Computational Linguistics (ACL), (December 2023)
R. Pryzant, D. Iter, J. Li, Y. Lee, C. Zhu, and M. Zeng. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pp. 7957–7968. Singapore, Association for Computational Linguistics, (December 2023)
S. Syed, T. Ziegenbein, P. Heinisch, H. Wachsmuth, and M. Potthast. Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pp. 114–129. Prague, Czechia, Association for Computational Linguistics, (September 2023)