- Our goal is to develop a probabilistic knowledge base that mirrors the content of the web. We are developing a system that uses semi-supervised learning methods to learn to extract symbolic knowledge from unstructured text and HTML. We are exploring methods of continuous learning, where our system runs 24x7, continuously learning to read better and continuously extracting facts from the web.
- Tom Mitchell (2009): self-supervised KBP, only NPs without Entity Linking
- Approach to convert any Web data into RSS format.
- Webstemmer is a web crawler and HTML layout analyzer that automatically extracts the main text of a news site without banners, ads, or navigation links mixed in.
- Towards Automatic Data Extraction from Large Web Sites
- The Freebase Wikipedia Extraction (WEX) is a processed dump of the English language Wikipedia.
- Semantic MediaWiki (SMW) is a free extension of MediaWiki that helps to search, organise, tag, browse, evaluate, and share the wiki's content. While traditional wikis contain only texts which computers can neither understand nor evaluate, SMW adds semantic annotations that bring the power of the Semantic Web to the wiki.
- MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). MuNPEx requires a part-of-speech (POS) tagger to work and can additionally use detected named entities (NEs) to improve chunking performance. Please read the documentation (or source code) for more details.
- Artificial Intelligence 85(1-2):101-134 (1996)
- International World Wide Web conference WWW 2009, New York, NY, USA, ACM Press, (2009)
- KDD, page 601-606. ACM, (2003)
- ECML/PKDD 1, volume 5211 of Lecture Notes in Computer Science, page 195-210. Springer, (2008)
- Proceedings of the 19th international conference on Computational linguistics, page 1--7. Morristown, NJ, USA, Association for Computational Linguistics, (2002)
- AAAI '99/IAAI '99: Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference, page 474--479. Menlo Park, CA, USA, American Association for Artificial Intelligence, (1999)
- Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, page 1003--1011. Suntec, Singapore, Association for Computational Linguistics, (August 2009)
- In Proceedings Multi-source, Multilingual Information Extraction and Summarization at RANLP-2007, Borovets, Bulgaria, (2007)
- Proc. of the 9th International Conference on Web Engineering, (2009)
- Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, page 21--30. Honolulu, Hawaii, Association for Computational Linguistics, (October 2008)
- ALT 2005, Algorithmic Learning Theory, 16th International Conference, page 297--311. Singapore, (October 2005)
- J. ACM 51(5):731--779 (2004)
- WWW '09: Proceedings of the 18th international conference on World wide web, page 971--980. New York, NY, USA, ACM, (2009)
- Proceedings of the 16th International Conference on Computational Linguistics COLING, page 466--471. Copenhagen, (1996)
- New York University, (2001)
- CIKM '99: Proceedings of the eighth international conference on Information and knowledge management, page 38--45. New York, NY, USA, ACM, (1999)
- WI '04: Proceedings of the 2004 IEEE/WIC/ACM International Conference on Web Intelligence, page 615--618. Washington, DC, USA, IEEE Computer Society, (2004)

