MALLET is an integrated collection of Java code useful for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications to text.
The Fat Jar Eclipse Plug-In is a deployment tool that packages an Eclipse Java project into a single executable jar.
It adds the entry "Build Fat-JAR" to the Export wizard.
Unlike the standard Eclipse jar exporter, it also includes referenced classes and jars in the "Fat-Jar", so the resulting jar contains all needed classes and can be executed directly with "java -jar": no classpath has to be set and no additional jars have to be deployed.
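What lets "java -jar" start a jar without a classpath is the Main-Class attribute in the jar's manifest. The sketch below is not the plug-in's own code; it uses only the standard java.util.jar API to build a small jar in memory and read the entry point back. The class name com.example.Main and the entry name are placeholders chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

class ExecutableJarSketch {

    // Builds an in-memory jar whose manifest names the entry point class.
    static byte[] buildJar(String mainClass, String entryName, byte[] entryBytes) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the main attributes are not written.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(buffer, manifest)) {
            // A real fat jar would copy every project class and every
            // class from the referenced jars here; one entry is enough for a sketch.
            jar.putNextEntry(new JarEntry(entryName));
            jar.write(entryBytes);
            jar.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the Main-Class attribute back from the jar bytes.
    static String readMainClass(byte[] jarBytes) {
        try (JarInputStream in = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            return in.getManifest().getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "com.example.Main" and the entry name are hypothetical placeholders.
        byte[] jar = buildJar("com.example.Main", "com/example/placeholder.txt", new byte[] {1, 2, 3});
        System.out.println(readMainClass(jar));
    }
}
```

Written to disk with real class files, a jar with this manifest would be started by "java -jar" using the named class as its entry point.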
This is the project page for SecondString, an open-source Java-based package of approximate string-matching techniques. This code was developed by researchers at Carnegie Mellon University from the Center for Automated Learning and Discovery, the Department of Statistics, and the Center for Computer and Communications Security.
SecondString is intended primarily for researchers in information integration and other scientists. It does or will include a range of string-matching methods from a variety of communities, including statistics, artificial intelligence, information retrieval, and databases. It also includes tools for systematically evaluating performance on test data. It is not designed for use on very large data sets.
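As an illustration of what an approximate string-matching technique looks like, here is a minimal sketch of Levenshtein edit distance, one of the classic methods in this family. It does not use SecondString's API; the class and method names are made up for this example.

```java
class EditDistanceSketch {

    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions and substitutions that turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i chars of a yields the empty string
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Normalizes the distance to a similarity score in [0, 1].
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
        System.out.println(similarity("kitten", "sitting"));
    }
}
```

The two-row table keeps memory proportional to the shorter dimension of the comparison, but the running time is still quadratic in the string lengths, which is one reason such measures are costly on very large data sets.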
ZXing (pronounced "zebra crossing") is an open-source, multi-format 1D/2D barcode image processing library implemented in Java. Our focus is on using the built-in camera on mobile phones to photograph and decode barcodes on the device, without communicating with a server. We currently have production-quality support for:
Axis2: Why bother? The Axis team is kicking up a big fuss about their recent release of Axis 2 (1.0!). Surprisingly, this library is so abysmally bad that I have yet to find someone who has managed to use it successfully.
A Java audio file tagger that uses the freedb online database to retrieve tags, released under the (L)GPL license. It supports custom file renaming from tags (with any directory structure) and vice versa. Supports: mp3, ogg, flac, mpc, ape, wma
The Java configurator located under org.policy.config is a powerful way of initializing a system. In addition, it makes it possible to load the application with different properties, and even with completely different functionality, without having to recompile the code. This HOW-TO describes how this configurator can be used.
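The API of org.policy.config is not shown here, so the following is only a sketch of the general idea, built on the standard java.util.Properties class rather than on the configurator itself. It picks an implementation class by name from a properties source, so behavior can change without recompiling. The key "list.impl" and all class and method names are assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesConfigSketch {

    // Parses key=value lines; in a real application these would come from a file.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Instantiates the class named under the given key, or the default
    // class when the key is absent. The class needs a no-argument constructor.
    static Object instantiate(Properties config, String key, String defaultClass) {
        String className = config.getProperty(key, defaultClass);
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + className, e);
        }
    }

    public static void main(String[] args) {
        // "list.impl" is a hypothetical key used only in this sketch.
        Properties config = parse("list.impl=java.util.LinkedList\n");
        Object list = instantiate(config, "list.impl", "java.util.ArrayList");
        System.out.println(list.getClass().getName()); // prints java.util.LinkedList
    }
}
```

Editing the properties text to name a different class changes which implementation is created, with no change to the compiled code.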
Markov Logic Networks (MLNs) are a powerful framework that combines statistical and logical reasoning; they have been applied to many data-intensive problems including information extraction, entity resolution, text mining, and natural language processing. Based on principled data management techniques, Tuffy is an MLN inference engine that achieves scalability and orders-of-magnitude speedup compared to prior art implementations. It is written in Java and relies on PostgreSQL. For a brief introduction to MLNs and the technical details of Tuffy, please see our technical report.
Areca-Backup is an open-source, easy-to-use and reliable backup solution for Linux and Windows that performs incremental, differential, delta and mirror backups on local hard drives, remote directories or FTP/FTPS servers.
This software is a translation into C++ of the excellent Webgraph library by P. Boldi and S. Vigna. The original library, written in Java, is easy to use but hampered by some requirements of the Java virtual machine. This C++ translation attempts to preserve much of the ease of use (through integration with the Boost Graph Library), but bypass requirements imposed by a virtual machine.
Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at.
I got this idea to create a servlet filter that would inspect the thread-local store for the thread currently processing the request and log any thread-local references that exist before the request is dispatched down the chain and also when it comes back. Such a filter could be packaged as a Confluence Servlet Filter Plugin, so that it is convenient to develop and deploy.
ROME is a set of open source Java tools for parsing, generating and publishing RSS and Atom feeds. The core ROME library depends only on the JDOM XML parser and supports parsing, generating and converting all of the popular RSS and Atom formats including RSS 0.90, RSS 0.91 Netscape, RSS 0.91 Userland, RSS 0.92, RSS 0.93, RSS 0.94, RSS 1.0, RSS 2.0, Atom 0.3, and Atom 1.0. You can parse to an RSS object model, an Atom object model or an abstract SyndFeed model that can model either family of formats.
In this excerpt, one of a series from Java Network Programming, 3rd Edition, Elliotte Rusty Harold demonstrates Java's handling of URLs, URIs, proxy servers, password protection, and HTTP GET.
Mockito is a mocking framework that tastes really good. It lets you write beautiful tests with a clean & simple API. Mockito doesn't give you a hangover because the tests are very readable and they produce clean verification errors.
With the advent of the semantic web, several projects have started to translate this bibliographic information to RDF. bibtex2rdf is a highly configurable translator from BibTeX to RDF which lets you do exactly that.
"This is one of the most intellectually challenging programming books that I have ever read...I strongly recommend that all Java programmers read this book."
If you've spent any time doing Java programming with Eclipse it must have occurred to you that support for viewing and editing Jar files is a little limited. Having used Eclipse for over eighteen months, and since I hadn't yet built an Eclipse plugin, I decided to dive right in and build a viewer/editor that would let me stop using File Explorer or WinZip(1) for manipulating the Jar files in my projects. Hopefully forever.
Five days later here it is: JarPlug, the Java ARchive PLUGin for Eclipse (sorry... :). And what days: going up the learning curve of Eclipse plugin internals and trying to figure out a workflow paradigm for editing Jar files that made sense inside the Eclipse IDE. More on that later.
ZXing (pronounced "zebra crossing") is an open-source, multi-format 1D/2D barcode image processing library implemented in Java. Our focus is on using the built-in camera on mobile phones to photograph and decode barcodes on the device, without communicating with a server. We currently have support for:
* UPC-A and UPC-E
* EAN-8 and EAN-13
* Code 39
* Code 128
* QR Code
* ITF
* RSS-14 (Stacked and Limited)
* Data Matrix ('alpha' quality)
* PDF 417 ('alpha' quality)
The Amazon Mechanical Turk SDK for Java is an open source set of libraries and tools designed to make it easier for you to build solutions leveraging Amazon Mechanical Turk (MTurk, MechTurk).