Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. For more information about Tika, please see the list of supported document formats and the available documentation. You can find the latest release on the download page. See the Getting Started guide for instructions on how to start using Tika.
Tika is a subproject of Apache Lucene. Lucene is a project of the Apache Software Foundation.
Step Towards Disease Outbreak Information Extraction: Automatic ...
http://naist.cpe.ku.ac.th/SlideSNLP2007/131207/A%20Step%20Towards%20Disease%20Outbreak%20Information%20Extraction%20Automatic%20Entity%20Role%20Recognition%20for%20Named%20Entities.pdf
A technique for studying disorder in quantum systems is able to spot significant patterns in large data sets such as web pages, and may be adaptable to
M. Javidi and E. Roshan. Speech Emotion Recognition by Using Combinations of Support Vector Machine (SVM), and C5.0, 1, page 21--33. Applied Mathematics and Sciences: An International Journal (MathSJ), (August 2014)
G. Muzny, M. Fang, A. Chang, and D. Jurafsky. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, page 460--470. Valencia, Spain, Association for Computational Linguistics, (April 2017)
C. Scheible, R. Klinger, and S. Padó. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), page 1736--1745. Berlin, Germany, Association for Computational Linguistics, (August 2016)