This is the project page for SecondString, an open-source Java-based package of approximate string-matching techniques. This code was developed by researchers at Carnegie Mellon University from the Center for Automated Learning and Discovery, the Department of Statistics, and the Center for Computer and Communications Security.
SecondString is intended primarily for researchers in information integration and for other scientists. It includes, or will include, string-matching methods from several communities, including statistics, artificial intelligence, information retrieval, and databases. It also includes tools for systematically evaluating performance on test data. It is not designed for use on very large data sets.
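To illustrate what an approximate string-matching method computes, here is a minimal sketch of one classic technique, Levenshtein edit distance, together with a normalized similarity score. It does not use SecondString's own API: the class name `EditDistanceSketch` and the methods `levenshtein` and `similarity` are hypothetical names chosen for this example, and the normalization by the longer string's length is just one common convention.

```java
// Illustrative sketch only; not part of SecondString.
// Class and method names are hypothetical.
final class EditDistanceSketch {
    private EditDistanceSketch() {}

    /**
     * Levenshtein distance: the minimum number of single-character
     * insertions, deletions, and substitutions that turn a into b.
     * Uses two rolling rows of the dynamic-programming table.
     */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into b[0..j) takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            // Turning a[0..i) into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            // Swap rows so that prev holds the row just computed.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Similarity in [0, 1]: 1 means identical strings. Normalizing by
     * the longer string's length is an assumption of this sketch.
     */
    static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0;
        }
        return 1.0 - (double) levenshtein(a, b) / maxLen;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
        System.out.println(similarity("Jon Smith", "John Smith"));
    }
}
```

The dynamic-programming table has one cell per pair of prefixes, so running time grows with the product of the two string lengths. Comparing every pair of records this way becomes expensive quickly as a data set grows.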
Finding important information in unstructured text
The vast majority of the information we deal with in everyday life consists of raw, unstructured text, in which the most important facts or concepts are not readily available but are hidden among the many details that accompany them. Handling the sheer amount of information we are exposed to requires more sophisticated procedures that uncover the important parts of a text and let us process more information in less time. The goal of this project is to develop robust and accurate techniques to automatically extract important information from unstructured text, in the form of keyphrases (keyphrase extraction) or entire sentences (extractive summarization).
Funded by Google
Naboj is a dynamic website that lets you review online scientific articles. At present, the only articles available for review are those posted to the Los Alamos arXiv.
C. Bretschneider, S. Zillner, M. Hammon, P. Gass, and D. Sonntag. 30th IEEE International Symposium on Computer-Based Medical Systems, CBMS 2017, Thessaloniki, Greece, June 22-24, 2017, pages 213-218. Los Alamitos, CA: IEEE Computer Society, 2017.