group :: kdtm | BibSonomy

bookmarks (176)

Sentiment Analysis in Financial News
publications?
16 years ago by @hkorte
tags: text_mining, sentiment_analysis, my_topic

TüBa-D/Z
The Tübingen Treebank of Written German (TüBa-D/Z) is a syntactically annotated corpus based on the newspaper "die tageszeitung" (taz). It currently comprises about 36,000 sentences, or 630,000 words.
16 years ago by @hkorte
tags: nlp, corpus, treebank

Introduction to Syntactic Parsing
A 91-slide overview of syntactic parsers that use PCFGs
16 years ago by @hkorte
tags: nlp, linguistics, parser

OpenCyc.org
OpenCyc is the open source version of the Cyc technology, the world's largest and most complete general knowledge base and commonsense reasoning engine.
16 years ago by @hkorte
tags: nlp, rdf, opensource, ontology

Agenda Setting Prozesse (pdf)
http://epub.ub.uni-muenchen.de/734/1/AgendaSettingProzesse.pdf
16 years ago by @hkorte
tags: mewi

SVM-JAVA: A Java implementation of SMO
SVM-JAVA, developed for research and educational purpose, is a Java implementation of John C. Platt's sequential minimal optimization (SMO) for training a support vector machine (SVM). This program is based on the pseudocode in "Fast Training of Support Vector Machines using Sequential Minimal Optimization" by John C. Platt and in "Sequential Minimal Optimization for SVM" by Xianping Ge. It currently supports linear and RBF kernels.
16 years ago by @hkorte
tags: java, svm, tools, programming

Install script for nice Euro signs in LaTeX
http://www.ctan.org/tex-archive/fonts/euro/
15 years ago by @hkorte
tags: latex

ICEpdf
ICEpdf is an open source Java PDF library ideal for displaying and printing PDF documents within any Java application.
15 years ago by @hkorte
tags: java, pdf, pdfrenderer, api

Headache relief for programmers - Regular Expression Generator
Helps to build a Regex based on an example string
15 years ago by @hkorte
tags: regex_generation, regex, tools

Jericho HTML Parser
Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
15 years ago by @hkorte
tags: java, parser, opensource, tools

Word Document Text Extractor
This Java class extracts the text from a Word 6.0/95/97/2000/XP document.
15 years ago by @hkorte
tags: java, text_extraction, tools

Extract RSS feeds from Web pages
Approach to convert any Web data into RSS format.
15 years ago by @hkorte
tags: rss, web_article_extraction, www, information_extraction, crawling, tools, C#

BigTable: Google’s Distributed Data Store
http://hnr.dnsalias.net/wordpress/2008/10/bigtable-googles-distributed-data-store/
15 years ago by @hkorte
tags: to_read

iText, a Free Java-PDF Library
iText is a library that allows you to generate PDF files on the fly.
15 years ago by @hkorte
tags: java, pdf, library, tools

CoNLL-2005 Shared Task: Semantic Role Labeling
http://www.lsi.upc.edu/~srlconll/
15 years ago by @hkorte
tags: data_source, semantic_role_labeling

Ext GWT - Java Component Library
Ext GWT: Rich Internet Application Framework for GWT.
15 years ago by @hkorte
tags: java, ria, google_web_toolkit, opensource, javascript, programming

Open Source Web Crawlers Written in Java
http://www.manageability.org/blog/stuff/open-source-web-crawlers-java
15 years ago by @hkorte
tags: java, crawling, tools

Watij - Web Application Testing in Java
Watij (pronounced wattage) stands for Web Application Testing in Java. It is a pure Java API created to allow for the automation of web applications.
15 years ago by @hkorte
tags: java, www, tools

andLinux.org
andLinux runs Linux natively inside Windows. It is a complete Ubuntu Linux system running seamlessly in Windows 2000 based systems (2000, XP, 2003, Vista, 7; 32-bit versions only).
14 years ago by @hkorte
tags: opensource, virtual_machine

Apache Wicket
With proper mark-up/logic separation, a POJO data model, and a refreshing lack of XML, Apache Wicket makes developing web-apps simple and enjoyable again.
14 years ago by @hkorte
tags: java, web

Django | The Web framework for perfectionists with deadlines
Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design.
13 years ago by @hkorte
tags: framework, web, python

RequireJS
RequireJS is a JavaScript file and module loader. It is optimized for in-browser use, but it can be used in other JavaScript environments, like Rhino and Node. Using a modular script loader like RequireJS will improve the speed and quality of your code.
13 years ago by @hkorte
tags: library, webapps, javascript

Jade - Java Agent DEvelopment Framework
JADE (Java Agent DEvelopment Framework) is a software Framework fully implemented in Java language. It simplifies the implementation of multi-agent systems through a middle-ware that complies with the FIPA specifications and through a set of graphical tools that supports the debugging and deployment phases
13 years ago by @hkorte
tags: java, library, aose, agent, p2p

Think Stats - Probability and Statistics for Programmers
Think Stats is an introduction to Probability and Statistics for Python programmers. It is completely available online in an HTML and a PDF version.
13 years ago by @hkorte
tags: book, creative_commons, statistics

Programming, Motherfucker - Do you speak it?
We are a community of motherfucking programmers who have been humiliated by software development methodologies for years. We are tired of XP, Scrum, Kanban, Waterfall, Software Craftsmanship (aka XP-Lite) and anything else getting in the way of...Programming, Motherfucker.
13 years ago by @hkorte
tags: programming

Dispatch
Dispatch is a library for asynchronous HTTP interaction. It provides a Scala vocabulary for Java’s async-http-client.
12 years ago by @hkorte
tags: scala, http_client

Paper.js
Scriptographer ported to JavaScript and the browser, using HTML5 Canvas.
12 years ago by @hkorte
tags: library, free_licence, graphics, canvas, javascript

PDF Split and Merge
Split and merge pdf documents with pdfsam, it’s free and open source.
12 years ago by @hkorte
tags: pdf_editing, java, pdf, tool

HPPC: High Performance Primitive Collections for Java
Carrot Search Labs: High Performance Primitive Collections for Java, JUnit benchmarking, Suffix Arrays for Java, CSS sprites
11 years ago by @hkorte
tags: java, collections, performance, tools

ACL Anthology
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics
16 years ago by @hkorte
tags: linguistics, link_list

SentiWordNet
SentiWordNet is a lexical resource for opinion mining. SentiWordNet assigns to each synset of WordNet three sentiment scores: positivity, negativity, objectivity.
16 years ago by @hkorte
tags: nlp, text_mining, sentiment_analysis

enunciate
Enunciate is a Web service deployment framework. It is not another Web service stack implementation. Rather, Enunciate leverages existing Web service technologies to provide a mechanism to build, package, deploy, and to clearly, accurately deliver your Web service API on the Java platform.
17 years ago by @hkorte
tags: framework, java, webservices, amf, programming

SALSA Project - Uni Saarland
The Saarbrücken Lexical Semantics Acquisition Project
16 years ago by @hkorte
tags: linguistic, text_mining, sentiment_analysis

TreeTagger - output visualisation module
Online Demo of the TreeTagger. A tool for annotating text with part-of-speech and lemma information which has been developed at the Institute for Computational Linguistics of the University of Stuttgart.
16 years ago by @hkorte
tags: nlp, linguistics, pos, text_mining, parser, visualisation

Shalmaneser Project Home
Shalmaneser is a supervised learning toolbox for shallow semantic parsing, i.e. the automatic assignment of semantic classes and roles to text. The system was developed for Frame Semantics; thus we use Frame Semantics terminology and call the classes frames and the roles frame elements. However, the architecture is reasonably general, and with a certain amount of adaption, Shalmaneser should be usable for other paradigms (e.g., PropBank roles) as well. Shalmaneser caters both for end users, and for researchers.
16 years ago by @hkorte
tags: nlp, linguistics, pos, text_mining, parser

Introduction to Information Retrieval
http://www-csli.stanford.edu/~hinrich/information-retrieval-book.html
16 years ago by @hkorte
tags: book, information_retrieval

FTD - Munzinger Archiv
The Munzinger-Archiv person database contains more than 20,000 biographies of prominent people and is updated continuously. It offers profiles of politicians and business leaders, as well as artists and scientists.
16 years ago by @hkorte
tags: data_source, named_entity_recognition, finance

PDFjam
PDFjam is a small collection of shell scripts which provide a simple interface to some of the functionality of the excellent pdfpages package (by Andreas Matthias) for pdfLaTeX.
16 years ago by @hkorte
tags: pdf, tools, cli

Linksammlung: Freie NLP-Software für die deutsche Sprache
All programs and resources on the list are free, i.e. available at no cost (for research purposes), applicable to German-language texts, and ready to use immediately, i.e. they do not first have to be trained with, for example, annotated corpora. The list is of course incomplete (as of 22 May 2007).
16 years ago by @hkorte
tags: nlp, linguistics, tools

Lecture Slides: Parsing German
Expressive Grammars for Natural Language Processing: Theory and applications - Today's lecture: Parsing German
16 years ago by @hkorte
tags: nlp, linguistics, parser, semantic_role_labeling, parsing_german

MSTParser
MSTParser is a non-projective dependency parser that searches for maximum spanning trees over directed graphs. Models of dependency structure are based on large-margin discriminative training methods. Projective parsing is also supported.
16 years ago by @hkorte
tags: java, parser, tools

TIGER API 1.8 - A Java interface to the TIGER corpus
TIGER API is a library which allows Java programmers to easily access the structure of any corpus given as a TIGER-XML file. It can process the TIGER corpus and any other corpus encoded in TIGER-XML. The underlying API specifies a Java object model for corpora encoded in TIGER-XML and provides methods for traversing syntax trees and accessing elements such as sentences, syntax graph nodes, and their attributes.
16 years ago by @hkorte
tags: java, nlp, corpus, tools, tiger

SemEval 2007: Tasks
Tasks and data of SemEval 2007
16 years ago by @hkorte
tags: nlp, conference, relation_extraction

The OpenNLP Homepage
OpenNLP is an organizational center for open source projects related to natural language processing. It hosts a variety of Java-based NLP tools which perform sentence detection, tokenization, POS tagging, chunking and parsing, named-entity detection, and coreference using the OpenNLP Maxent machine learning package.
16 years ago by @hkorte
tags: java, nlp, pos, text_mining, parser, opensource, tools

natural language processing blog: F-measure versus Accuracy
F-measure versus Accuracy
16 years ago by @hkorte
tags: evaluation, information_retrieval, statistics, blog

Octave-Tutorial (PDF)
http://www.amm.mw.tum.de/fileadmin/Image_Archive/Lehre/Prakt_MKS/Tutorial.pdf
16 years ago by @hkorte
tags: tutorial, octave

ACE - Automatic Content Extraction
The objective of the ACE Program is to develop extraction technology to support automatic processing of source language data (in the form of natural text, and as text derived from ASR and OCR). This includes classification, filtering, and selection based on the language content of the source data, i.e., based on the meaning conveyed by the data. Thus the ACE program requires the development of technologies that automatically detect and characterize this meaning. The ACE research objectives are viewed as the detection and characterization of Entities, Relations, and Events.
16 years ago by @hkorte
tags: nlp, data_source, relation_extraction

Project Gutenberg
The first producer of free electronic books
16 years ago by @hkorte
tags: ebooks, free

Grails Framework Reference Documentation
http://grails.org/doc/1.0.x/
16 years ago by @hkorte
tags: groovy, docs, grails

JNI Kernel Extension for SVMlight
This software is an extension of the SVMlight software. It provides an interface to kernel functions that are implemented in Java by means of the Java Native Interface (JNI) Invocation API.
16 years ago by @hkorte
tags: java, svmlight, svm, kernels, tools, programming

RelEx Semantic Relationship Extractor - OpenCog
RelEx, a narrow-AI component of OpenCog, is an English-language semantic relationship extractor, built on the Carnegie-Mellon link parser. It can identify subject, object, indirect object and many other dependency relationships between words in a sentence; it generates dependency trees, resembling those of dependency grammars.
15 years ago by @hkorte
tags: relation_extraction, opensource, tools

Freebase Wikipedia Extraction (WEX)
The Freebase Wikipedia Extraction (WEX) is a processed dump of the English language Wikipedia.
15 years ago by @hkorte
tags: information_extraction, opensource, wikipedia, tools

xyling - LaTeX macros for linguistic graphics
http://www.ling.uni-potsdam.de/~rvogel/xyling/
15 years ago by @hkorte
tags: linguistics, trees, latex

LaTeX: Packages in the ‘graphics’ bundle
User-manual for the packages color, graphics, and graphicx.
15 years ago by @hkorte
tags: latex

Java Open Source NLP and Text Mining tools
This is an overview of the open source NLP and machine learning tools for text mining, information extraction, text classification, clustering, approximate string matching, language parsing and tagging, and more.
15 years ago by @hkorte
tags: nlp, tools, textmining

JBoss Cache as a POJO Cache
Tutorial on JBossCache POJO.
15 years ago by @hkorte
tags: java, db_collections, jboss, tutorial, pojo_cache, tools

Apache POI - Java API To Access Microsoft Format Files
The POI project consists of APIs for manipulating various file formats based upon Microsoft's OLE 2 Compound Document format, and Office OpenXML format, using pure Java. In short, you can read and write MS Excel files using Java. In addition, you can read and write MS Word and MS PowerPoint files using Java.
15 years ago by @hkorte
tags: java, library, tools

GNU Bash Reference Manual
This text is a brief description of the features that are present in the Bash shell.
15 years ago by @hkorte
tags: reference, bash, manual

Cleaneval development dataset
CLEANEVAL is a shared task and competitive evaluation on the topic of cleaning arbitrary web pages, with the goal of preparing web data for use as a corpus, for linguistic and language technology research and development.
15 years ago by @hkorte
tags: data_source, corpus, eval_corpus, html2text

TreeTagger for Java
TreeTagger for Java is a Java wrapper around the popular TreeTagger package by Helmut Schmid.
15 years ago by @hkorte
tags: java, pos, tools

10 open source books worth downloading
http://www.tectonic.co.za/?p=4491
15 years ago by @hkorte
tags: book, opensource

The On-Line Encyclopedia of Integer Sequences
The On-Line Encyclopedia of Integer Sequences (OEIS), also cited simply as Sloane's, is an extensive searchable database of integer sequences, freely available on the Web.
15 years ago by @hkorte
tags: database, integer, sequences

Cohen's Kappa for more than two annotators and multiple classes
Online Calculator for Cohen's Kappa
15 years ago by @hkorte
tags: online, tools, statistics

Mozilla Labs Jetpack
API for creating Firefox add-ons with HTML, CSS and JavaScript.
15 years ago by @hkorte
tags: API, javascript, programming

Das Fußball Studio
Das Fußball Studio is freeware for managing and evaluating football leagues and tournaments. It also includes the Bundesliga database with complete data for the 1. and 2. Bundesliga.
15 years ago by @hkorte
tags: data_source, sports_betting

Bundesliga - Die offizielle Webseite
Official football statistics
15 years ago by @hkorte
tags: data_source, sports_betting

The Road Runner Project
Towards Automatic Data Extraction from Large Web Sites
15 years ago by @hkorte
tags: java, regex, www, information_extraction, crawling

reWork: a regular expression workbench
Helps to test regular expressions
15 years ago by @hkorte
tags: regex, tools

AKBC - First Workshop on Automated Knowledge Base Construction
This workshop will gather researchers in a variety of fields that contribute to the automated construction of knowledge bases. It will be held at Xerox Research Centre Europe, near Grenoble (France), May 17-19, 2010.
14 years ago by @hkorte
tags: knowledge_base_population, conference, workshop

OntoLT - Middleware for Ontology Extraction from Text
The OntoLT approach aims at a more direct connection between ontology engineering and linguistic analysis. OntoLT is a Protégé plug-in, with which concepts (Protégé classes) and relations (Protégé slots) can be extracted automatically from linguistically annotated text collections. It provides mapping rules, defined by use of a precondition language that allow for a mapping between linguistic entities in text and class/slot candidates in Protégé.
14 years ago by @hkorte
tags: java, knowledge_base_population, opensource, tools, ontology

OLP3: 3rd Workshop on Ontology Learning and Population
http://olp.dfki.de/olp3/
14 years ago by @hkorte
tags: knowledge_base_population, workshop, ontology

MLComp
MLcomp is a free website for objectively comparing machine learning programs across various datasets for multiple problem domains.
14 years ago by @hkorte
tags: comparison, machine_learning, tools

Text selector jquery plugin
http://perplexed.co.uk/1020_text_selector_jquery_plugin.htm
14 years ago by @hkorte
tags: web, tools, jquery, javascript, programming

T-Verteilung
http://psydok.sulb.uni-saarland.de/volltexte/2004/268/html/tvert.htm
14 years ago by @hkorte
tags: statistics

Z-Type
Type To Shoot Game
13 years ago by @hkorte
tags: game

Unsupervised Semantic Parsing Source Code
Source code to repeat the paper evaluation: We present the first unsupervised approach to the problem of learning a semantic parser, using Markov logic. Our USP system transforms dependency trees into quasi-logical forms, recursively induces lambda forms from these, and clusters them to abstract away syntactic variations of the same meaning. The MAP semantic parse of a sentence is obtained by recursively assigning its parts to lambda-form clusters and composing them. We evaluate our approach by using it to extract a knowledge base from biomedical abstracts and answer questions. USP substantially outperforms TextRunner, DIRT and an informed baseline on both precision and recall on this task.
13 years ago by @hkorte
tags: knowledge_base_population, nlp, unsupervised, dependency_trees, markov_logic, tools

Tangle: a JavaScript library for reactive documents
Tangle is a JavaScript library for creating reactive documents. Your readers can interactively explore possibilities, play with parameters, and see the document update immediately. Tangle is super-simple and easy to learn.
12 years ago by @hkorte
tags: library, free_licence, math, visualisation, javascript

TregexPattern (Stanford JavaNLP API)
Scans trees for a given node pattern
16 years ago by @hkorte
tags: java, trees, programming

OpenThesaurus - Deutscher Thesaurus - Synonyme und Assoziationen
OpenThesaurus is an open-source thesaurus for the German language. Anyone can take part and improve the entries.
16 years ago by @hkorte
tags: nlp, linguistics, thesaurus, opensource

ASV Toolbox
ASV Toolbox is a modular collection of tools for the exploration of written language data. They work either on word lists or text and solve several linguistic classification and clustering tasks. The topics covered contain language detection, POS-tagging, base form reduction, named entity recognition, and terminology extraction.
16 years ago by @hkorte
tags: java, nlp, linguistics, pos, text_mining, tools

Introduction to Semantic MediaWiki
Semantic MediaWiki (SMW) is a free extension of MediaWiki that helps to search, organise, tag, browse, evaluate, and share the wiki's content. While traditional wikis contain only texts which computers can neither understand nor evaluate, SMW adds semantic annotations that bring the power of the Semantic Web to the wiki.
16 years ago by @hkorte
tags: rdf, relation_extraction, wiki, information_extraction, semantics

Movie Review Data
Movie review data: test set for sentiment analysis
16 years ago by @hkorte
tags: nlp, data_source, sentiment_analysis, corpus, opinion_mining

Bayesian Support Vector Machine Hyperparameter Tuning
Software for parameter tuning for SVM classifiers
16 years ago by @hkorte
tags: svm_tuning, svm, tools

SVM: a brief introduction (Presentation)
http://www.cad.zju.edu.cn/home/zhx/ML/ML2008-SVM.pdf
16 years ago by @hkorte
tags: overview, svm

Learning with Kernels - Support Vector Machines, Regularization, Optimization and Beyond
This web page provides information, errata, as well as about a third of the chapters of the book Learning with Kernels, written by Bernhard Schölkopf and Alex Smola (MIT Press, Cambridge, MA, 2002).
16 years ago by @hkorte
tags: book, svm, kernels

Opinion Mining: A list of bibtex entries
Looks like an interesting and huge collection of opinion mining / sentiment classification papers
15 years ago by @hkorte
tags: nlp, sentiment_analysis, opinion_mining

Detecting Known and New Salting Tricks in Unwanted Emails
We have developed a system that enables the detection of certain common salting tricks that are employed by criminals. Salting is the intentional addition or distortion of content. In this paper we describe a framework to identify email messages that might contain new, previously unseen tricks. To this end, we compare the simulated perceived email message text generated by our hidden salting simulation system to the OCRed text we obtain from the rendered email message. We present robust text comparison techniques and train a classifier based on the differences of these two texts. In simulations we show that we can detect suspicious emails with a high level of accuracy.
15 years ago by @paass
tags: phishing, email, security, filtering

Improved Phishing Detection using Model-Based Features
We investigate the statistical filtering of phishing emails, where a classifier is trained on characteristic features of existing emails and subsequently is able to identify new phishing emails with different contents. We propose advanced email features generated by adaptively trained Dynamic Markov Chains and by novel latent Class-Topic Models. On a publicly available test corpus classifiers using these features are able to reduce the number of misclassified emails by two thirds compared to previous work. Using a recently proposed more expressive evaluation method we show that these results are statistically significant. In addition we successfully tested our approach on a non-public email corpus with a real-life composition.
15 years ago by @paass
tags: mining, phishing, data, email, security, filtering

Das LATEX2e-Sündenregister
Obsolete commands, packages, and other errors
15 years ago by @hkorte
tags: latex

HTML Parser
HTML Parser is a Java library used to parse HTML in either a linear or nested fashion. Primarily used for transformation or extraction, it features filters, visitors, custom tags and easy to use JavaBeans. It is a fast, robust and well tested package. It is a fast real-time parser for real-world HTML. What has attracted most developers to HTMLParser has been its simplicity in design, speed and ability to handle streaming real-world html.
15 years ago by @hkorte
tags: java, opensource, tools, programming

Webstemmer
Webstemmer is a web crawler and HTML layout analyzer that automatically extracts main text of a news site without having banners, ads and/or navigation links mixed up
15 years ago by @hkorte
python
web_article_extraction
www
information_extraction
crawling
tools
2SimMetrics
SimMetrics is a similarity metric library, covering metrics from edit distances (Levenshtein, Gotoh, Jaro, etc.) to others (e.g. Soundex, Chapman).
15 years ago by @hkorte
java
nlp
tools
string_similarity
7Joda Time - Java date and time API
Joda-Time provides a quality replacement for the Java date and time classes. The design allows for multiple calendar systems, while still providing a simple API. The 'default' calendar is the ISO8601 standard which is used by XML. The Gregorian, Julian, Buddhist, Coptic, Ethiopic and Islamic systems are also included, and we welcome further additions. Supporting classes include time zone, duration, format and parsing.
15 years ago by @hkorte
java
library
tools
1JBoss PojoCache Tutorial
PojoCache is an in-memory, transactional, and replicated POJO (plain old Java object) cache system that allows users to operate on a POJO transparently without active user management of either replication or persistency aspects. This tutorial focuses on the usage of the PojoCache API.
15 years ago by @hkorte
java
db_collections
tutorials
jboss
pojo_cache
1GWT API's for SmartClient
SmartGWT is a GWT based framework that allows you to not only utilize its comprehensive widget library for your application UI, but also tie these widgets in with your server-side for data management. SmartGWT is based on the powerful and mature SmartClient library.
15 years ago by @hkorte
java
google_web_toolkit
tools
license_lgpl
javascript
programming
1Cibyl
Cibyl is a programming environment and binary translator that allows compiled C programs to execute on J2ME-capable phones. Cibyl uses GCC to compile the C programs to MIPS binaries, and these are then recompiled into Java bytecode.
15 years ago by @hkorte
c-to-java-translator
tools
4ConceptNet
ConceptNet represents data in the form of a semantic network, and makes it available to be used in natural language processing and intelligent user interfaces.
15 years ago by @hkorte
nlp
corpus
WordNet
ontology
33The Protégé Ontology Editor and Knowledge Acquisition System
Protégé is a free, open source ontology editor and knowledge-base framework. The Protégé platform supports two main ways of modeling ontologies via the Protégé-Frames and Protégé-OWL editors. Protégé ontologies can be exported into a variety of formats including RDF(S), OWL, and XML Schema. Protégé is based on Java, is extensible, and provides a plug-and-play environment that makes it a flexible base for rapid prototyping and application development.
14 years ago by @hkorte
editor
java
opensource
tools
ontology
1Polylingual Topic Models
Input: parallel corpora (translated documents). A very interesting type of multilingual topic model that does not need word alignment in the parallel corpora.
14 years ago by @paass
parallel
language
topic
model
corpora
multi
1nodeunit - Unit testing in node.js
Node.js provides its own assert module with some really useful functions for creating basic tests. However, the reporting and running of these assertions can become complicated, especially with asynchronous code. How can you be sure that all assertions ran? Or that they ran in the correct order? This is where nodeunit comes in, a tool for defining and running unit tests in the simplest way possible.
14 years ago by @hkorte
node.js
unit_testing
tools
javascript
programming


publications (hide)214

1Learning the Kernel with Hyperkernels
C. Ong, A. Smola, R. Williamson, and R. Herbrich. Journal of Machine Learning Research, (2005)
16 years ago by @hkorte
svm
kernels
14Support-Vector Networks
C. Cortes, and V. Vapnik. Machine Learning, (1995)
15 years ago by @hkorte
machine_learning
svm
kernels
6The Proposition Bank: An Annotated Corpus of Semantic Roles
M. Palmer, D. Gildea, and P. Kingsbury. Computational Linguistics, 31 (1): 71-105 (2005)
15 years ago by @hkorte
nlp
data_source
corpus
english
11Stacked Generalization
D. Wolpert. Neural Networks, 5 (2): 241-259 (1992)
15 years ago by @hkorte
machine_learning
stacking
2Dynamic rating of sports teams
L. Knorr-Held. The Statistician, (2000)
15 years ago by @hkorte
sports_betting
2Text Mining Systems for Market Response to News: A Survey
M. Mittermayer, and G. Knolmayer. (2006)
16 years ago by @hkorte
text_mining
survey
6A statistical interpretation of term specificity and its application in retrieval
K. Jones. Journal of Documentation, (1972)
16 years ago by @hkorte
nlp
text_mining
10Automatic Labeling of Semantic Roles
D. Gildea, and D. Jurafsky. Computational Linguistics, 28 (3): 245-288 (September 2002)
15 years ago by @hkorte
nlp
semantic_role_labeling
16Unsupervised named-entity extraction from the Web: An experimental study
O. Etzioni, M. Cafarella, D. Downey, A. Popescu, T. Shaked, S. Soderland, D. Weld, and A. Yates. Artificial Intelligence, 165 (1): 91-134 (2005)
15 years ago by @hkorte
unsupervised
www
named_entity_recognition
8Kernel methods for relation extraction
D. Zelenko, C. Aone, and A. Richardella. J. Mach. Learn. Res., (2003)
16 years ago by @hkorte
relation_extraction
kernels
11An Evaluation of Statistical Approaches to Text Categorization
Y. Yang. Information Retrieval, 1 (1-2): 69-90 (1999)
16 years ago by @hkorte
text_classification
information_retrieval
statistics
2Evaluating a news-aware quantitative trader: The effect of momentum and contrarian stock selection strategies
R. Schumaker, and H. Chen. Journal of the American Society for Information Science and Technology, 59 (2): 247-255 (2008)
16 years ago by @hkorte
text_mining
my_topic
8Text Classification using String Kernels
H. Lodhi, C. Saunders, J. Shawe-Taylor, N. Cristianini, and C. Watkins. Journal of Machine Learning Research, (2002)
16 years ago by @hkorte
kernels
2Self-supervised relation extraction from the Web
B. Rosenfeld, and R. Feldman. Knowledge and Information Systems, 17 (1): 17-33 (2008)
14 years ago by @hkorte
relation_extraction
unsupervised
www
2Word-Sequence Kernels
N. Cancedda, E. Gaussier, C. Goutte, and J. Renders. Journal of Machine Learning Research, (2003)
15 years ago by @hkorte
kernels
flat_structured
subsequence
2Evolutionary tuning of multiple SVM parameters
F. Friedrichs, and C. Igel. Neurocomputing, (2004). Trends in Neurocomputing: 12th European Symposium on Artificial Neural Networks 2004.
16 years ago by @hkorte
svm_tuning
svm
5Improving Generalization with Active Learning
D. Cohn, L. Atlas, and R. Ladner. Machine Learning, 15 (2): 201-221 (1994)
15 years ago by @hkorte
machine_learning
active_learning
3Large Scale Multiple Kernel Learning
S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf. Journal of Machine Learning Research, (2006)
15 years ago by @hkorte
svm
kernels
10Large Margin Methods for Structured and Interdependent Output Variables
I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun. Journal of Machine Learning Research, (2005)
15 years ago by @hkorte
svm
structured_output
3The Entire Regularization Path for the Support Vector Machine.
T. Hastie, S. Rosset, R. Tibshirani, and J. Zhu. Journal of Machine Learning Research, (2004)
13 years ago by @hkorte
svm_tuning
svm



KD Text Mining
