SmartGWT is a GWT-based framework that lets you use its comprehensive widget library for your application UI and tie those widgets to your server side for data management. SmartGWT is based on the powerful and mature SmartClient library.
Joda-Time provides a quality replacement for the Java date and time classes. The design allows for multiple calendar systems, while still providing a simple API. The default calendar is the ISO 8601 standard, which is used by XML. The Gregorian, Julian, Buddhist, Coptic, Ethiopic and Islamic systems are also included, and we welcome further additions. Supporting classes cover time zones, durations, formatting and parsing.
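To make the idea of an ISO 8601 default alongside other calendar systems concrete, here is a minimal sketch. It deliberately uses the JDK's own java.time package rather than the Joda-Time API, so the class and method names (CalendarSketch, parseIso, buddhistYear) are illustrative assumptions and not part of Joda-Time.

```java
import java.time.LocalDate;
import java.time.chrono.ThaiBuddhistDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoField;

// Illustration only: uses the JDK's java.time, NOT the Joda-Time API.
final class CalendarSketch {

    // Parses an ISO 8601 calendar date such as "2024-01-15".
    static LocalDate parseIso(String text) {
        return LocalDate.parse(text, DateTimeFormatter.ISO_LOCAL_DATE);
    }

    // Returns the year of the same day in the Thai Buddhist calendar,
    // whose year numbering runs 543 years ahead of the ISO year.
    static int buddhistYear(LocalDate isoDate) {
        return ThaiBuddhistDate.from(isoDate).get(ChronoField.YEAR);
    }

    public static void main(String[] args) {
        LocalDate date = parseIso("2024-01-15");
        System.out.println(date + " (ISO) falls in Buddhist year " + buddhistYear(date));
    }
}
```

The same day is represented once and only viewed through a different calendar system, which is the general shape of a multi-calendar design.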
The POI project consists of APIs for manipulating various file formats based upon Microsoft's OLE 2 Compound Document format, and Office OpenXML format, using pure Java. In short, you can read and write MS Excel files using Java. In addition, you can read and write MS Word and MS PowerPoint files using Java.
PojoCache is an in-memory, transactional, and replicated POJO (plain old Java object) cache system that allows users to operate on a POJO transparently, without actively managing either replication or persistence. This tutorial focuses on the usage of the PojoCache API.
Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, a web administration interface and many more features. It runs in a Java servlet container such as Tomcat.
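As a rough illustration of the HTTP interface, the sketch below builds a query URL that requests JSON output with hit highlighting and faceting switched on. The parameters q, wt, hl, facet and facet.field are standard Solr query parameters. The base URL, the field name "category" and the helper class SolrQueryUrl are assumptions made for the example, and the exact path of the select handler depends on the Solr version and how it is deployed.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: only assembles a URL string, it does not contact a server.
final class SolrQueryUrl {

    // Builds a select URL asking for JSON output, highlighting, and facet counts on one field.
    static String build(String baseUrl, String query, String facetField) {
        return baseUrl + "/select"
                + "?q=" + encode(query)
                + "&wt=json"
                + "&hl=true"
                + "&facet=true"
                + "&facet.field=" + encode(facetField);
    }

    // URL-encodes a parameter value (this overload requires Java 10 or later).
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The base URL is an assumption about a local installation.
        System.out.println(build("http://localhost:8983/solr", "title:java search", "category"));
    }
}
```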
This is the project page for SecondString, an open-source Java-based package of approximate string-matching techniques. This code was developed by researchers at Carnegie Mellon University from the Center for Automated Learning and Discovery, the Department of Statistics, and the Center for Computer and Communications Security.
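For readers unfamiliar with approximate string matching, one classic technique is Levenshtein edit distance. The sketch below is a plain-Java illustration of that general idea and does not use or reproduce the SecondString API. The class name EditDistance and the normalized similarity score are assumptions chosen for the example.

```java
// Illustration of one approximate string-matching technique (Levenshtein distance).
// This is NOT SecondString code; all names here are hypothetical.
final class EditDistance {

    // Minimum number of single-character insertions, deletions and
    // substitutions needed to turn a into b, using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Maps the distance to a score in [0, 1], where 1 means identical strings.
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
    }
}
```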
It contains a Web Crawler and an HTML Parser, with NER and REX to follow "in the near future". It also includes JWikiDocs, a Java tool for crawling and downloading Wikipedia documents.
HTML Parser is a Java library used to parse HTML in either a linear or nested fashion. Primarily used for transformation or extraction, it features filters, visitors, custom tags and easy-to-use JavaBeans. It is a fast, robust and well-tested package.
It is a fast real-time parser for real-world HTML. What has attracted most developers to HTMLParser is its simplicity of design, its speed and its ability to handle streaming real-world HTML.
SVM-JAVA, developed for research and educational purposes, is a Java implementation of John C. Platt's sequential minimal optimization (SMO) algorithm for training a support vector machine (SVM). This program is based on the pseudocode in "Fast Training of Support Vector Machines using Sequential Minimal Optimization" by John C. Platt and in "Sequential Minimal Optimization for SVM" by Xianping Ge. It currently supports linear and RBF kernels.
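The two kernel types mentioned above can be sketched in a few lines of Java. This is a generic illustration of a linear kernel (dot product) and an RBF kernel of the form exp(-gamma * ||x - y||^2), not code taken from SVM-JAVA. The Kernel interface, the Kernels class and the gamma parameterization are assumptions made for the example.

```java
// Generic kernel functions for illustration; NOT taken from SVM-JAVA.
final class Kernels {

    // Hypothetical interface: a kernel maps two feature vectors to a similarity value.
    interface Kernel {
        double apply(double[] x, double[] y);
    }

    // Linear kernel: the plain dot product of x and y (equal lengths assumed).
    static final Kernel LINEAR = (x, y) -> {
        double dot = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
        }
        return dot;
    };

    // RBF kernel: exp(-gamma * squared Euclidean distance between x and y).
    static Kernel rbf(double gamma) {
        return (x, y) -> {
            double squaredDistance = 0.0;
            for (int i = 0; i < x.length; i++) {
                double d = x[i] - y[i];
                squaredDistance += d * d;
            }
            return Math.exp(-gamma * squaredDistance);
        };
    }

    public static void main(String[] args) {
        double[] a = {1, 2};
        double[] b = {3, 4};
        System.out.println(LINEAR.apply(a, b));   // prints 11.0
        System.out.println(rbf(0.5).apply(a, a)); // prints 1.0
    }
}
```

Keeping the kernel behind a small interface means a training routine such as SMO can be written once and run with either kernel.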
This software is an extension of the SVMlight software. It provides an interface to kernel functions that are implemented in Java by means of the Java Native Interface (JNI) Invocation API.
OpenNLP is an organizational center for open source projects related to natural language processing. It hosts a variety of Java-based NLP tools which perform sentence detection, tokenization, POS tagging, chunking and parsing, named-entity detection, and coreference resolution using the OpenNLP Maxent machine learning package.
ASV Toolbox is a modular collection of tools for the exploration of written language data. The tools work on either word lists or text and solve several linguistic classification and clustering tasks. The topics covered include language detection, POS tagging, base form reduction, named entity recognition, and terminology extraction.