Web content mining is related to, but different from, data mining and text mining. It is related to data mining because many data mining techniques can be applied in Web content mining. It is related to text mining because much Web content is text.
TeSSI® (Terminology Supported Semantic Indexing) is a state-of-the-art tool that improves upon the existing search and retrieval tools by extracting the meaning out of medical free text and placing the resulting medical ‘concepts’ in the document...
This is the home page of the ParsCit project, which performs reference string parsing, sometimes also called citation parsing or citation extraction. It is architected as a supervised machine learning procedure that uses Conditional Random Fields as its learning mechanism. You can download the code below, parse strings online, or send batch jobs to our web service (coming soon!). The code includes the training data, a feature generator, and shell scripts to connect the system to a web service (used here too).
M. Schwab, R. Jäschke, and F. Fischer. Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, page 110--115. Association for Computational Linguistics, (2023)
F. Arnold, and R. Jäschke. Proceedings of the Workshop Understanding LIterature references in academic full TExt at JCDL 2022, volume 3220 of ULITE-ws '22, page 7--15. CEUR Workshop Proceedings, (2022)
M. Schwab, R. Jäschke, and F. Fischer. Proceedings of the 5th International Conference on Natural Language and Speech Processing, page 282--287. Association for Computational Linguistics, (2022)
J. Rotsztejn, N. Hollenstein, and C. Zhang. (2018). arXiv:1804.02042. Comment: Accepted to SemEval 2018 (12th International Workshop on Semantic Evaluation).
G. Muzny, M. Fang, A. Chang, and D. Jurafsky. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, page 460--470. Valencia, Spain, Association for Computational Linguistics, (April 2017)
D. Knoell, M. Atzmueller, C. Rieder, and K. Scherer. Proc. GWEM 2017, co-located with 9th Conference Professional Knowledge Management (WM 2017), Karlsruhe, Germany, KIT, (2017, in press)
N. Peng, H. Poon, C. Quirk, K. Toutanova, and W. Yih. ACL, (2017). arXiv:1708.03743. Comment: Conditionally accepted by TACL in December 2016; published in April 2017; presented at ACL in August 2017.
D. Dligach, T. Miller, C. Lin, S. Bethard, and G. Savova. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, page 746--751. (2017)
P. Ludwig, M. Thiel, and A. Nürnberger. Semantic Keyword-Based Search on Structured Data Sources: COST Action IC1302 Second International KEYSTONE Conference (IKC 2016) Revised Selected Papers, page 37--48. Cham, Springer International Publishing, (2017)