group :: kdtm | BibSonomy

bookmarks: 176

HPPC: High Performance Primitive Collections for Java
Carrot Search Labs: High Performance Primitive Collections for Java, JUnit benchmarking, Suffix Arrays for Java, CSS sprites
11 years ago by @hkorte
java
collections
performance
tools

The Kiji Project
Build Real-time Big Data Applications on Apache HBase. Open Source. Apache 2.0 Licensed. Gives a natural, intuitive toolkit for predictive modeling with machine learning library.
11 years ago by @hkorte
library
free_licence
machine_learning

Principle Components Analysis in Java without incurring brain damage
http://johnsogg.blogspot.de/2010/06/principle-components-analysis-in-java.html
11 years ago by @hkorte
java
implementation
PCA

Paper.js
Scriptographer ported to JavaScript and the browser, using HTML5 Canvas.
12 years ago by @hkorte
library
free_licence
graphics
canvas
javascript

Dispatch
Dispatch is a library for asynchronous HTTP interaction. It provides a Scala vocabulary for Java’s async-http-client.
12 years ago by @hkorte
scala
http_client

jWebSocket - The Open Source Java WebSocket Server
jWebSocket is a pure Java/JavaScript high speed bidirectional communication solution for the Web - secure, reliable and fast. Provides easy integration into existing Tomcat web applications.
12 years ago by @hkorte
java
tomcat
web2.0
webapps
websocket

Tangle: a JavaScript library for reactive documents
Tangle is a JavaScript library for creating reactive documents. Your readers can interactively explore possibilities, play with parameters, and see the document update immediately. Tangle is super-simple and easy to learn.
12 years ago by @hkorte
library
free_licence
math
visualisation
javascript

sigma.js - a lightweight JavaScript graph drawing library
sigma.js is an open-source lightweight JavaScript library to draw graphs, using the HTML canvas element. It is designed for interactive display of static graphs exported from graph visualization software such as Gephi, and for dynamic display of graphs generated on the fly.
12 years ago by @hkorte
library
free_licence
visualization
graph
jquery
javascript

Sejda - pdf manipulation layer in java
An extendible and configurable PDF manipulation layer. It is a ready to use java library to perform PDF document manipulation without having to deal with the low level API.
12 years ago by @hkorte
pdf_editing
java
library
maven

PDF Split and Merge
Split and merge PDF documents with pdfsam, a free and open-source tool.
12 years ago by @hkorte
pdf_editing
java
pdf
tool

Data Extraction, Web Screen Scraping Tool, Mozenda Scraper
The Mozenda Scraper provides web data extraction software and web screen scraping tools that make it easy to capture nearly any content from the web. See how you can start getting data from the web in minutes.
12 years ago by @hkorte
web
scraper

How to Write a Spelling Corrector
An example of a toy spelling corrector that achieves 80 or 90% accuracy at a processing speed of at least 10 words per second in less than a page of Python code.
12 years ago by @hkorte
python
spelling_correction

Programming, Motherfucker - Do you speak it?
We are a community of motherfucking programmers who have been humiliated by software development methodologies for years. We are tired of XP, Scrum, Kanban, Waterfall, Software Craftsmanship (aka XP-Lite) and anything else getting in the way of...Programming, Motherfucker.
12 years ago by @hkorte
programming

Think Stats - Probability and Statistics for Programmers
Think Stats is an introduction to Probability and Statistics for Python programmers. It is completely available online in an HTML and a PDF version.
12 years ago by @hkorte
book
creative_commons
statistics

Jade - Java Agent DEvelopment Framework
JADE (Java Agent DEvelopment Framework) is a software Framework fully implemented in Java language. It simplifies the implementation of multi-agent systems through a middle-ware that complies with the FIPA specifications and through a set of graphical tools that supports the debugging and deployment phases
13 years ago by @hkorte
java
library
aose
agent
p2p

coffee-maven-plugin - Apache Maven Plugin for Coffeescript
https://github.com/talios/coffee-maven-plugin
13 years ago by @hkorte
coffeescript
maven
javascript

jcoffeescript
JCoffeeScript is a java library that compiles CoffeeScript 1.1.
13 years ago by @hkorte
library
coffeescript
tools
javascript

RequireJS
RequireJS is a JavaScript file and module loader. It is optimized for in-browser use, but it can be used in other JavaScript environments, like Rhino and Node. Using a modular script loader like RequireJS will improve the speed and quality of your code.
13 years ago by @hkorte
library
webapps
javascript

List of resources: Article text extraction from HTML documents | My tech blog.
http://tomazkovacic.com/blog/56/list-of-resources-article-text-extraction-from-html-documents/
13 years ago by @hkorte
web_article_extraction
tools
scraper

Refine, reuse and request data | ScraperWiki
Scrape and link data using Ruby, Python and PHP scripts that run maintenance-free in the cloud. Request data for scoops and better decisions.
13 years ago by @hkorte
python
web_article_extraction
php
tools
scraper

Adding Keyboard Navigation | jQuery for Designers - Tutorials and screencasts
A jQuery plugin to easily capture keyboard input.
13 years ago by @hkorte
library
javascript

Handling Keyboard Shortcuts in JavaScript
Despite the many JavaScript libraries that are available today, I cannot find one that makes it easy to add keyboard shortcuts (or accelerators) to your JavaScript app. This is because keyboard shortcuts were only used in JavaScript games; no serious web application used keyboard shortcuts to navigate around its interface. But Google apps like Google Reader and Gmail changed that. So, I have created a function to make adding shortcuts to your application much easier.
13 years ago by @hkorte
library
javascript

Unsupervised Semantic Parsing Source Code
Source code to repeat the paper evaluation: We present the first unsupervised approach to the problem of learning a semantic parser, using Markov logic. Our USP system transforms dependency trees into quasi-logical forms, recursively induces lambda forms from these, and clusters them to abstract away syntactic variations of the same meaning. The MAP semantic parse of a sentence is obtained by recursively assigning its parts to lambda-form clusters and composing them. We evaluate our approach by using it to extract a knowledge base from biomedical abstracts and answer questions. USP substantially outperforms TextRunner, DIRT and an informed baseline on both precision and recall on this task.
13 years ago by @hkorte
knowledge_base_population
nlp
unsupervised
dependency_trees
markov_logic
tools

Evaluating Text Extraction Algorithms | My tech blog.
Lately I’ve been working on evaluating and comparing algorithms capable of extracting useful content from arbitrary HTML documents. I have made a feature-wise comparison of related software and APIs.
13 years ago by @hkorte
evaluation
text_extraction
html2text

Spark Cluster Computing Framework
Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly much quicker than with disk-based systems like Hadoop.
13 years ago by @hkorte
scala
hadoop
tools
parallelization

AWS Elastic Beanstalk
AWS Elastic Beanstalk is an even easier way for developers to quickly deploy and manage applications in the AWS cloud without having to worry about the physical infrastructure or the resource configuration that accompanies setting up that infrastructure. You simply upload your application and AWS Elastic Beanstalk automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling, and application health monitoring, while allowing you to change configuration settings and deploy new versions.
13 years ago by @hkorte
java
cloud
webapps
amazon
tools
programming

jqMath
jqMath is a JavaScript module that makes it easy to put formatted mathematical expressions in web pages.
13 years ago by @hkorte
web
math
jquery
javascript

Django | The Web framework for perfectionists with deadlines
Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design.
13 years ago by @hkorte
framework
web
python

Acronym Creator - find a name for your company, project, algorithm
Trying to find a name for a company, project, algorithm, product? Acronym Creator helps you generate a name that is an acronym or abbreviation. With this acronym builder, abbreviation maker, name generator, label finder - whatever you call it - you can make your own acronyms and have fun!
13 years ago by @hkorte
tools

Aptana Studio 3
The professional, open source development tool for the open web. Develop and test your entire web application using a single environment. With support for the latest browser technology specs such as HTML5, CSS3 and JavaScript; and Ruby, Rails, PHP & Python on the server side. We've got you covered!
13 years ago by @hkorte
web
ide
tools
eclipse
javascript

Z-Type
Type To Shoot Game
13 years ago by @hkorte
game

T-Verteilung
http://psydok.sulb.uni-saarland.de/volltexte/2004/268/html/tvert.htm
14 years ago by @hkorte
statistics

JNI method and constructor signature cheat sheet
The JNI signatures that are required when getting methods and constructors are a bit hard to understand.
14 years ago by @hkorte
java
c
jni
programming

nodeunit - Unit testing in node.js
Node.js provides its own assert module with some really useful functions for creating basic tests. However, the reporting and running of these assertions can become complicated, especially with asynchronous code. How can you be sure that all assertions ran? Or that they ran in the correct order? This is where nodeunit comes in, a tool for defining and running unit tests in the simplest way possible.
14 years ago by @hkorte
node.js
unit_testing
tools
javascript
programming

YAGO-NAGA - Javatools
The Javatools are a collection of Java classes for a variety of small tasks, such as parsing, database interaction or file handling. They were developed by Fabian M. Suchanek for the YAGO-NAGA project. The Javatools are licensed under a Creative Commons Attribution 3.0 License by the YAGO-NAGA team.
14 years ago by @hkorte
java
free
tools

nweb: a tiny, safe Web server written in C
A simple Web server with only 200 lines of C source code. In this article, Nigel Griffiths provides a copy of this Web server and includes the source code as well. You can see exactly what it can and can't do.
14 years ago by @hkorte
c
web
programming

Iconfinder - Free icons
Iconfinder provides high quality icons for webdesigners and developers in an easy and efficient way. Many icons are free for commercial use.
14 years ago by @hkorte
data_source
icons
programming

jquery-panel-magic
The aim of this jQuery plugin is to provide a unique way to "panelize" a website. It brings a new approach to website and web application navigation.
14 years ago by @hkorte
library
web
javascript

Text selector jquery plugin
http://perplexed.co.uk/1020_text_selector_jquery_plugin.htm
14 years ago by @hkorte
web
tools
jquery
javascript
programming

Use jQuery to Get User Selected Text
Shows how to use a jQuery mouseup event handler to trigger the selection processing.
14 years ago by @hkorte
web
tools
jquery
javascript
programming

Kostenlose Fachbücher für das Studium
Free textbooks for university studies, available as ebooks.
14 years ago by @hkorte
ebooks
free

Trainable Relation Extraction framework
T-Rex (Trainable Relation Extraction) is a highly configurable machine learning-based Information Extraction from Text framework, which includes tools for document classification, entity extraction and relation extraction.
14 years ago by @hkorte
relation_extraction
tools
named_entity_recognition

Polylingual Topic Models
Input: parallel corpora (translated documents). A very interesting type of multilingual topic model that does not need word alignment in the parallel corpora.
14 years ago by @paass
parallel
language
topic
model
corpora
multi

MLComp
MLcomp is a free website for objectively comparing machine learning programs across various datasets for multiple problem domains.
14 years ago by @hkorte
comparison
machine_learning
tools

Apache Wicket
With proper mark-up/logic separation, a POJO data model, and a refreshing lack of XML, Apache Wicket makes developing web-apps simple and enjoyable again.
14 years ago by @hkorte
java
web

TimeML
Markup Language for Temporal and Event Expressions - TimeML is a robust specification language for events and temporal expressions in natural language.
14 years ago by @hkorte
time_annotations
language

The Protégé Ontology Editor and Knowledge Acquisition System
Protégé is a free, open source ontology editor and knowledge-base framework. The Protégé platform supports two main ways of modeling ontologies via the Protégé-Frames and Protégé-OWL editors. Protégé ontologies can be exported into a variety of formats including RDF(S), OWL, and XML Schema. Protégé is based on Java, is extensible, and provides a plug-and-play environment that makes it a flexible base for rapid prototyping and application development.
14 years ago by @hkorte
editor
java
opensource
tools
ontology

OntoLT - Middleware for Ontology Extraction from Text
The OntoLT approach aims at a more direct connection between ontology engineering and linguistic analysis. OntoLT is a Protégé plug-in, with which concepts (Protégé classes) and relations (Protégé slots) can be extracted automatically from linguistically annotated text collections. It provides mapping rules, defined in a precondition language, that allow for a mapping between linguistic entities in text and class/slot candidates in Protégé.
14 years ago by @hkorte
java
knowledge_base_population
opensource
tools
ontology

OLP3: 3rd Workshop on Ontology Learning and Population
http://olp.dfki.de/olp3/
14 years ago by @hkorte
knowledge_base_population
workshop
ontology

AKBC - First Workshop on Automated Knowledge Base Construction
This workshop will gather researchers in a variety of fields that contribute to the automated construction of knowledge bases. It will be held at Xerox Research Centre Europe, near Grenoble (France), May 17-19, 2010.
14 years ago by @hkorte
knowledge_base_population
conference
workshop

andLinux.org
andLinux runs Linux natively inside Windows. It is a complete Ubuntu Linux system running seamlessly in Windows 2000 based systems (2000, XP, 2003, Vista, 7; 32-bit versions only).
14 years ago by @hkorte
opensource
virtual_machine

QxWT
QxWT is a JSNI-Wrapper for the Qooxdoo JavaScript library to use it in GWT applications.
14 years ago by @hkorte
java
google_web_toolkit
widgets
javascript

qooxdoo
qooxdoo is a comprehensive and innovative framework for creating rich internet applications (RIAs). Leveraging object-oriented JavaScript allows developers to build impressive cross-browser applications. No HTML, CSS, or DOM knowledge is needed.
14 years ago by @hkorte
java
free_licence
ria
google_web_toolkit
tools
javascript

Watij - Web Application Testing in Java
Watij (pronounced wattage) stands for Web Application Testing in Java. It is a pure Java API created to allow for the automation of web applications.
14 years ago by @hkorte
java
www
tools

Read the Web Research Project at Carnegie Mellon
Our goal is to develop a probabilistic knowledge base that mirrors the content of the web. We are developing a system that uses semi-supervised learning methods to learn to extract symbolic knowledge from unstructured text and HTML. We are exploring methods of continuous learning, where our system runs 24x7, continuously learning to read better, and continuously extracting facts from the web.
15 years ago by @hkorte
knowledge_base_population
project
information_extraction
ontology

Videolecture: Populating the Semantic Web by Macro-Reading Internet Text
Tom Mitchell (2009): self-supervised KBP, only NPs without Entity Linking
15 years ago by @hkorte
knowledge_base_population
videolectures
www
information_extraction
semanticweb
self-supervised
ontology

Word Document Text Extractor
This Java class extracts the text from a Word 6.0/95/97/2000/XP document.
15 years ago by @hkorte
java
text_extraction
tools

Videolecture: Human Language technology for the Semantic Web
Paul Buitelaar (2005)
15 years ago by @hkorte
knowledge_base_population
videolectures
to_view
semanticweb
ontology

The UCSD Multiple Kernel Learning Repository
A collection of data sets for use with multiple kernel learning algorithms.
15 years ago by @hkorte
data_source
svm
kernels
classification
multiple_kernel_learning

Darmstadt Knowledge Processing Repository (DKPro Repository)
The DKPro Repository aims at providing the NLP research community with a collection of ready-to-use, robust NLP components for Apache UIMA.
15 years ago by @hkorte
framework
uima
uima_components

ConceptNet
ConceptNet represents data in the form of a semantic network, and makes it available to be used in natural language processing and intelligent user interfaces.
15 years ago by @hkorte
nlp
corpus
WordNet
ontology

Open Source Web Crawlers Written in Java
http://www.manageability.org/blog/stuff/open-source-web-crawlers-java
15 years ago by @hkorte
java
crawling
tools

10 open source books worth downloading
http://www.tectonic.co.za/?p=4491
15 years ago by @hkorte
book
opensource

MegaMap - A simple, unbounded hashtable for Java
MegaMap is a Java implementation of a map (or hashtable) that can store an unbounded amount of data, limited only by the amount of disk space available. Objects stored in the map are persisted to disk. Good performance is achieved by an in-memory cache. The MegaMap can, for all practical purposes, be thought of as a map implementation with unlimited storage space.
15 years ago by @hkorte
java

Cibyl
Cibyl is a programming environment and binary translator that allows compiled C programs to execute on J2ME-capable phones. Cibyl uses GCC to compile the C programs to MIPS binaries, and these are then recompiled into Java bytecode.
15 years ago by @hkorte
c-to-java-translator
tools

NestedVM
NestedVM provides binary translation for Java Bytecode. This is done by having GCC compile to a MIPS binary which is then translated to a Java class file. Hence any application written in C, C++, Fortran, or any other language supported by GCC can be run in 100% pure Java with no source changes.
15 years ago by @hkorte
c-to-java-translator
tools

TreeTagger for Java
TreeTagger for Java is a Java wrapper around the popular TreeTagger package by Helmut Schmid.
15 years ago by @hkorte
java
pos
tools

Making browsers faster: Resource Packages
A proposal to make downloading web page resources faster in all browsers (by using compressed jar files for images etc.).
15 years ago by @hkorte
webapps
tools
programming

Jericho HTML Parser
Jericho HTML Parser is a java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
15 years ago by @hkorte
java
parser
opensource
tools

Cleaneval development dataset
CLEANEVAL is a shared task and competitive evaluation on the topic of cleaning arbitrary web pages, with the goal of preparing web data for use as a corpus, for linguistic and language technology research and development.
15 years ago by @hkorte
data_source
corpus
eval_corpus
html2text

TextPro tool
TextPro is a suite of modular Natural Language Processing (NLP) tools for analysis of Italian and English texts.
15 years ago by @hkorte
nlp
tools
named_entity_recognition

GNU Emacs Manual
Emacs is the extensible, customizable, self-documenting real-time display editor. This Info file describes how to edit with Emacs and some of how to customize it; it corresponds to GNU Emacs version 23.1.
15 years ago by @hkorte
reference
emacs
manual

GNU Bash Reference Manual
This text is a brief description of the features that are present in the Bash shell.
15 years ago by @hkorte
reference
bash
manual

Semantic Role Labeling Demo (CCG)
Online Demo of a Semantic Role Labeling system.
15 years ago by @hkorte
semantic_role_labeling
demo
tools

DNB - Normdaten-DVD-ROM
This DVD-ROM from the Deutsche Nationalbibliothek (German National Library) contains the Personennamendatei (PND, the name authority file), the Schlagwortnormdatei (SWD, the subject headings authority file), and the Gemeinsame Körperschaftsdatei (GKD, the corporate bodies authority file). It can be obtained directly from the Deutsche Nationalbibliothek.
15 years ago by @hkorte
data_source
database

GWT API's for SmartClient
SmartGWT is a GWT based framework that allows you to not only utilize its comprehensive widget library for your application UI, but also tie these widgets in with your server-side for data management. SmartGWT is based on the powerful and mature SmartClient library.
15 years ago by @hkorte
java
google_web_toolkit
tools
license_lgpl
javascript
programming

Ext GWT - Java Component Library
Ext GWT: Rich Internet Application Framework for GWT.
15 years ago by @hkorte
java
ria
google_web_toolkit
opensource
javascript
programming

log4javascript
A logging framework for JavaScript based on log4j.
15 years ago by @hkorte
library
logging
javascript

CoNLL-2005 Shared Task: Semantic Role Labeling
http://www.lsi.upc.edu/~srlconll/
15 years ago by @hkorte
data_source
semantic_role_labeling

Computer Aided Translation Tool
Caitra is an experimental translation tool developed by the Machine Translation Group at the University of Edinburgh.
15 years ago by @hkorte
machine_translation
tools

Joda Time - Java date and time API
Joda-Time provides a quality replacement for the Java date and time classes. The design allows for multiple calendar systems, while still providing a simple API. The 'default' calendar is the ISO8601 standard which is used by XML. The Gregorian, Julian, Buddhist, Coptic, Ethiopic and Islamic systems are also included, and we welcome further additions. Supporting classes include time zone, duration, format and parsing.
15 years ago by @hkorte
java
library
tools

Apache POI - Java API To Access Microsoft Format Files
The POI project consists of APIs for manipulating various file formats based upon Microsoft's OLE 2 Compound Document format, and Office OpenXML format, using pure Java. In short, you can read and write MS Excel files using Java. In addition, you can read and write MS Word and MS PowerPoint files using Java.
15 years ago by @hkorte
java
library
tools

iText, a Free Java-PDF Library
iText is a library that allows you to generate PDF files on the fly.
15 years ago by @hkorte
java
pdf
library
tools

JBoss PojoCache Tutorial
PojoCache is an in-memory, transactional, and replicated POJO (plain old Java object) cache system that allows users to operate on a POJO transparently without active user management of either replication or persistency aspects. This tutorial focuses on the usage of the PojoCache API.
15 years ago by @hkorte
java
db_collections
tutorials
jboss
pojo_cache

JBoss Cache as a POJO Cache
Tutorial on JBossCache POJO.
15 years ago by @hkorte
java
db_collections
jboss
tutorial
pojo_cache
tools

BigTable: Google’s Distributed Data Store
http://hnr.dnsalias.net/wordpress/2008/10/bigtable-googles-distributed-data-store/
15 years ago by @hkorte
to_read

Apache Solr
Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, a web administration interface and many more features. It runs in a Java servlet container such as Tomcat.
15 years ago by @hkorte
java
lucene
tools
search
6LingPipe
LingPipe is a suite of Java libraries for the linguistic analysis of human language.
15 years ago by @hkorte
java
nlp
tools
4SecondString Project
This is the project page for SecondString, an open-source Java-based package of approximate string-matching techniques. This code was developed by researchers at Carnegie Mellon University from the Center for Automated Learning and Discovery, the Department of Statistics, and the Center for Computer and Communications Security.
15 years ago by @hkorte
java
nlp
tools
string_similarity
2SimMetrics
SimMetrics is a similarity metric library offering measures that range from edit distances (Levenshtein, Gotoh, Jaro, etc.) to other metrics (e.g. Soundex, Chapman).
15 years ago by @hkorte
java
nlp
tools
string_similarity
3Java Open Source NLP and Text Mining tools
This is an overview of the open source NLP and machine learning tools for text mining, information extraction, text classification, clustering, approximate string matching, language parsing and tagging, and more.
15 years ago by @hkorte
nlp
tools
textmining
1Extract RSS feeds from Web pages
Approach to convert any Web data into RSS format.
15 years ago by @hkorte
rss
web_article_extraction
www
information_extraction
crawling
tools
C#
1Webstemmer
Webstemmer is a web crawler and HTML layout analyzer that automatically extracts the main text of a news site without banners, ads, or navigation links mixed in.
15 years ago by @hkorte
python
web_article_extraction
www
information_extraction
crawling
tools
2reWork: a regular expression workbench
Helps to test regular expressions
15 years ago by @hkorte
regex
tools
3Headache relief for programmers - Regular Expression Generator
Helps to build a Regex based on an example string
15 years ago by @hkorte
regex_generation
regex
tools
2ICEpdf
ICEpdf is an open source Java PDF library ideal for displaying and printing PDF documents within any Java application.
15 years ago by @hkorte
java
pdf
pdfrenderer
api
2The Road Runner Project
Towards Automatic Data Extraction from Large Web Sites
15 years ago by @hkorte
java
regex
www
information_extraction
crawling
1Bundesliga - Die offizielle Webseite
Official football statistics
15 years ago by @hkorte
data_source
sports_betting
1Das Fußball Studio
Das Fußball Studio is freeware for managing and analyzing football leagues and tournaments. It also includes the Bundesliga database with complete data for the 1. and 2. Bundesliga.
15 years ago by @hkorte
data_source
sports_betting
1Mozilla Labs Jetpack
API for creating Firefox add-ons with HTML, CSS and JavaScript.
15 years ago by @hkorte
API
javascript
programming


publications (hide)214

3A tutorial on principal components analysis
L. Smith. Cornell University, USA, (February 2002)
11 years ago by @hkorte
tutorial
PCA
2BNS feature scaling: an improved representation over tf-idf for svm text classification
G. Forman. Proceeding of the 17th ACM conference on Information and knowledge management, page 263--270. New York, NY, USA, ACM, (2008)
11 years ago by @hkorte
svm
feature_selection
feature_scaling
3SpotSigs: robust and efficient near duplicate detection in large web collections.
M. Theobald, J. Siddharth, and A. Paepcke. SIGIR, page 563-570. ACM, (2008)
12 years ago by @hkorte
duplicate_detection
web
crawling
3The Entire Regularization Path for the Support Vector Machine.
T. Hastie, S. Rosset, R. Tibshirani, and J. Zhu. Journal of Machine Learning Research, (2004)
13 years ago by @hkorte
svm_tuning
svm
5Unsupervised Semantic Parsing.
H. Poon, and P. Domingos. EMNLP, page 1-10. ACL, (2009)
13 years ago by @hkorte
knowledge_base_population
nlp
unsupervised
dependency_trees
7An Empirical Study of Automated Dictionary Construction for Information Extraction in Three Domains
E. Riloff. Artificial Intelligence, 85 (1-2): 101-134 (1996)
14 years ago by @hkorte
nlp
linguistics
text_mining
information_extraction
2Self-supervised relation extraction from the Web
B. Rosenfeld, and R. Feldman. Knowledge and Information Systems, 17 (1): 17-33 (2008)
14 years ago by @hkorte
relation_extraction
unsupervised
www
3Extracting Relations with Integrated Information Using Kernel Methods
S. Zhao, and R. Grishman. ACL, The Association for Computer Linguistics, (2005)
14 years ago by @hkorte
relation_extraction
kernels
2Semantic relation extraction with kernels over typed dependency trees
F. Reichartz, H. Korte, and G. Paass. KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, page 773--782. New York, NY, USA, ACM, (2010)
14 years ago by @hkorte
ACE
relation_extraction
tree_kernels
dependency_trees
7RelExt: A Tool for Relation Extraction from Text in Ontology Extension
A. Schutz, and P. Buitelaar. The Semantic Web - ISWC 2005, Springer, (2005)
14 years ago by @hkorte
knowledge_base_population
relation_extraction
ontology
1Semi-Supervised Named Entity Recognition: Learning to Recognize 100 Entity Types with Little Supervision
D. Nadeau. (2007)
14 years ago by @hkorte
thesis
semi-supervised
phd
named_entity_recognition
11A survey of named entity recognition and classification
D. Nadeau, and S. Sekine. Linguisticae Investigationes, 30 (1): 3--26 (January 2007). Publisher: John Benjamins Publishing Company.
14 years ago by @hkorte
survey
named_entity_recognition
2Automated Construction and Growth of a Large Ontology
F. Suchanek. (2009)
14 years ago by @hkorte
knowledge_base_population
phd
ontology
12SOFIE: A Self-Organizing Framework for Information Extraction
F. Suchanek, M. Sozio, and G. Weikum. International World Wide Web conference (WWW 2009), New York, NY, USA, ACM Press, (2009)
14 years ago by @hkorte
knowledge_base_population
information_extraction
ontology
6Combining Linguistic and Statistical Analysis to Extract Relations from Web Documents
F. Suchanek, G. Ifrim, and G. Weikum. 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006), page 712--717. New York, NY, USA, ACM, (2006)
14 years ago by @hkorte
knowledge_base_population
relation_extraction
www
svm
knn
1Text Mining and Multimedia Search in a Large Content Repository
G. Paaß, S. Eickeler, and S. Wrobel. Proc. Sabre Conference on Text Mining Services (TMS '09) 2009, Leipzig, page 15-22. (2009)
14 years ago by @paass
2Towards terascale knowledge acquisition
P. Pantel, D. Ravichandran, and E. Hovy. Proceedings of the 20th international conference on Computational Linguistics (COLING-04), page 771--777. Geneva, Switzerland, Association for Computational Linguistics, (2004)
14 years ago by @hkorte
knowledge_base_population
1New Filtering Approaches for Phishing Email
A. Bergholz, J. Beer, S. Glahn, M. Moens, G. Paass, and S. Strobel. Journal of Computer Security, 18 (1): 7-35 (2010)
14 years ago by @paass
phishing
approaches
email
security
filtering
textmining
3Coupled Semi-Supervised Learning for Information Extraction
A. Carlson, J. Betteridge, R. Wang, E. Hruschka Jr., and T. Mitchell. WSDM '10: Proceedings of the third ACM international conference on Web search and data mining, page 101--110. New York, NY, USA, ACM, (2010)
14 years ago by @hkorte
knowledge_base_population
8Ontology Learning and Population: Bridging the Gap between Text and Knowledge
P. Buitelaar, and P. Cimiano (Eds.). Frontiers in Artificial Intelligence and Applications, IOS Press, Amsterdam, (2008)
14 years ago by @hkorte
knowledge_base_population
ontology


KD Text Mining
