eXist-db is an open source database management system built using XML technology. It stores XML data according to the XML data model and features efficient, index-based XQuery processing.
Intel® Threading Building Blocks (TBB) offers a rich and complete approach to expressing parallelism in a C++ program. It is a library that helps you take advantage of multi-core processor performance without having to be a threading expert. Threading Building Blocks is not just a threads-replacement library. It represents a higher-level, task-based parallelism that abstracts platform details and threading mechanisms for scalability and performance.
By participating in the Pentaho Community, you will be joining people from all over the world, working together to build the worlds first open source, best-in-class solution for enterprise Business Intelligence (BI). This community web site is the place to meet, find resources and become part of the creation process. The best way to get started is to sign up for the forums and read the How to Contribute page.
It contains a Web Crawler, HTML Parser and ("in the near future") NER and REX.
Additionally, including JWikiDocs, a Java tool for crawling and downloading Wikipedia documents.