Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. For more information about Tika, please see the list of supported document formats and the available documentation. You can find the latest release on the download page. See the Getting Started guide for instructions on how to start using Tika.
Tika is a subproject of Apache Lucene. Lucene is a project of the Apache Software Foundation.
Step Towards Disease Outbreak Information Extraction: Automatic ...
http://naist.cpe.ku.ac.th/SlideSNLP2007/131207/A%20Step%20Towards%20Disease%20Outbreak%20Information%20Extraction%20Automatic%20Entity%20Role%20Recognition%20for%20Named%20Entities.pdf
A technique for studying disorder in quantum systems is able to spot significant patterns in large data sets such as web pages, and may be adaptable to
Y. Jin, Y. Matsuo, and M. Ishizuka. Proceedings of the European Semantic Web Conference, ESWC2007, volume 4519 of Lecture Notes in Computer Science, Springer-Verlag, (July 2007)
C.-H. Chang, M. Kayed, M. R. Girgis, and K. Shaalan. IEEE Transactions on Knowledge and Data Engineering, 18 (10): 1411--1428, (2006)
H. Han, C. Giles, E. Manavoglu, H. Zha, Z. Zhang, and E. Fox. JCDL '03: Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries, page 37--48. Washington, DC, USA, IEEE Computer Society, (2003)
A. Takasu. JCDL '03: Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries, page 49--60. Washington, DC, USA, IEEE Computer Society, (2003)
S. Huffman. Connectionist, Statistical, and Symbolic Approaches to Learning for Natural Language Processing, volume 1040, page 246--260. Springer, (1996)
G. Gottlob, C. Koch, R. Baumgartner, M. Herzog, and S. Flesca. Proceedings of the Twenty-third ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 14-16, 2004, Paris, France, page 1--12. ACM, (2004)