This is a collection of bibliographies of scientific literature in computer science from various sources, covering most aspects of computer science. The bibliographies are updated weekly from their original locations such that you'll always find the most recent versions here.
In machine learning, the supervised approach usually requires a significant number of training examples to induce accurate classifiers. However, labeling data is often done manually,
Hypermedia systems are programs capable of storing and retrieving non-linear information, establishing a complex and flexible structure represented by interconnected nodes. As the navigation space grows, as happens in the
Open source tools have recently reached a level of maturity which makes them suitable for building large-scale real-world systems. At the same time, the field of machine learning has developed a large body of powerful learning algorithms for a wide[...]
This is a large online bibliography on automated text categorization (ATC). You can either view it or download it as a single file (ASCII text in BibTeX format) or access the fully searchable online version.
Concept mining is a discipline at the nexus of data mining, text mining, and linguistics, drawing on artificial intelligence and statistics. It aims to extract concepts from documents.
Fast Artificial Neural Network Library is a free open source neural network library, which implements multilayer artificial neural networks in C with support for both fully connected and sparsely connected networks. Cross-platform execution
SIMBRAIN is a free tool for building, running, and analyzing neural networks (computer simulations of brain circuitry). Simbrain aims to be as visual and easy-to-use as possible.
Vintage design, when done well, can make a user feel like they have been transported back in time. This particular style often uses design elements that look like they were found in the attic of an old house dating back to the 1920’s, 30’s, or 40’s.
This project contains naive Bayes and Fisher classifiers, as described in Toby Segaran's book "Programming Collective Intelligence." The book has Python implementations; this is a Java implementation.
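For readers unfamiliar with the technique, here is a minimal sketch of a naive Bayes text classifier. It is not the code from the book or from this project: the class name `NaiveBayes`, its methods, and the add-one smoothing are assumptions chosen for illustration, and the Fisher method is not covered.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Hypothetical multinomial naive Bayes over words, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of training texts
        self.vocabulary = set()                  # every word seen during training

    def train(self, text, category):
        words = text.lower().split()
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            # Score = log P(category) + sum of log P(word | category).
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[category].values())
            denominator = total_words + len(self.vocabulary)
            for word in words:
                count = self.word_counts[category][word]
                score += math.log((count + 1) / denominator)
            if score > best_score:
                best_category, best_score = category, score
        return best_category
```

Training is a matter of calling `train(text, category)` on labeled examples; `classify(text)` then returns the category with the highest log-probability score. Working in log space avoids numeric underflow when texts are long.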
ci-bayes, a project hosted on java.net, has released its first stable version. ci-bayes allows the use of a classifier to determine what classification a given object might fall into, given prior training, and provides multiple
English translation of selected chapters of the WikiWord thesis "Automatischer Aufbau eines multilingualen Thesaurus durch Extraktion semantischer und lexikalischer Relationen aus der Wikipedia" by Daniel Kinzler. Translation by the author.
My diploma thesis about a system to automatically build a multilingual thesaurus from Wikipedia, "WikiWord", is finally done. I handed it in yesterday. My research will hopefully help to make Wikipedia more accessible for automatic processing
ConceptNet is a freely available commonsense knowledgebase and natural-language-processing toolkit which supports many practical textual-reasoning tasks over real-world documents right out-of-the-box (without additional statistical training) including
JavaNNS is the successor of SNNS. It is based on its computing kernel, with a newly developed, comfortable graphical user interface written in Java set on top of it. Hence the compatibility with SNNS is achieved, while the platform-independence is increa
If you are starting with Neural Networks you should check out my online book on the subject. It contains over 300 pages of information on Neural Network Programming in Java. You can access it here.
Kilim is a message-passing framework for Java that provides ultra-lightweight threads and facilities for fast, safe, zero-copy messaging between these threads.
If you work from home, you know you’re lucky. Friends and family envy your ability to sleep late, take breaks and manage your own schedule and project list. But couldn’t things get even easier? Instead of rolling out of bed and stumbling
I’m currently in the latter stages of writing my master's thesis. I’ve been using LaTeX from the start and have learnt a few tricks for how to work most effectively with large documents like theses and books.
Today we're excited to announce that we're open sourcing reddit. We've always strived to be as open and transparent with our users as possible, and this is the next logical step. When we say 'open-source' we mean specifically that the code behind reddit
So, a while ago, I decided to code a library to plot some information I had. The idea was to create simple graphics that would be easy to create, beautiful, and good to present to people with little or no background in math and computers.
One thing I really love about the Python programming language is its incredible extensibility. Here’s a list of 50 awesome modules for Python, covering almost all needs: Databases, GUIs, Images, Sound, OS interaction, Web, and more.
Here's a relatively simple way to implement data versioning in a database, in a way that should be scalable as well. It only needs a couple of support tables and a single function and can apply versioning across multiple data sets concurrently.
Michael Abrash's classic Graphics Programming Black Book is a compilation of Michael's writings on assembly language and graphics programming (including from his "Graphics Programming" column in Dr. Dobb's Journal
We want to make all the world's content more accessible, interoperable and valuable. Some call it Web 2.0, Web 3.0, the Semantic Web or the Giant Global Graph - we call our piece of it Calais.
Tagaroo is designed to make your WordPress blog better for you, better for your readers and more accessible to search engines. As you’re writing, Tagaroo analyzes the text in your post and suggests intelligent tags for the things and events you’re
Built on the same architectural pattern as the web, "REST" increasingly dominates SOA (Service Oriented Architecture) implementations these days. In this article, we will discuss some basic design principles of REST.
"RRDtool is the Open Source industry standard, high performance data logging and graphing system for time series data. It stores the data in a very compact way that will not expand over time, and it can create beautiful graphs."
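The "will not expand over time" property comes from round-robin storage: a fixed number of slots, where each new sample overwrites the oldest one. The sketch below illustrates only that idea. It is not RRDtool's implementation or API (RRDtool also consolidates samples into coarser archives, which is omitted here), and the class `RoundRobinArchive` and its methods are hypothetical names.

```python
from collections import deque


class RoundRobinArchive:
    """Hypothetical fixed-size store: once full, a new sample evicts the oldest."""

    def __init__(self, slots):
        # deque(maxlen=...) discards items from the left when it overflows,
        # so memory use stays bounded no matter how many samples arrive.
        self.samples = deque(maxlen=slots)

    def update(self, timestamp, value):
        self.samples.append((timestamp, value))

    def fetch(self):
        # Samples are returned oldest first.
        return list(self.samples)
```

With three slots and five updates, `fetch()` returns only the last three samples; the first two have been overwritten.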
In general, processes take longer to start than threads. This makes sense if you think about it - a thread lives within the memory space of its parent process, so it takes less work
The AJAX Libraries API is a content distribution network and loading architecture for the most popular open source JavaScript libraries. By using the Google AJAX API Loader's google.load() method, your application has high speed, globally available access
A 100% cotton, black t-shirt features a pile of dead, white kittens in a little puddle of adorable blood with the caption, "Every time you Can Has, God kills a LOLcat."
For every musical instrument that becomes a symbol of modern music, many more are doomed to become the retarded cousin that gets stashed in the basement during dinner parties. Below, we present the best of those retarded cousins.
Getting started with Jersey is very easy. First, download the latest distribution of Jersey and unzip it. If you are using NetBeans IDE 6.x, you do not need to download the Jersey distribution. Instead, install the RESTful Web Services plugin from the Plu
The Django REST interface makes it easy to offer private and public APIs for existing Django models. New generic views simplify data retrieval and modification in a resource-centric architecture
A while ago I created a lot of icons for web applications I designed while working for the most excellent employer there is: Manentia Software. Now, we decided to liberate them with a Creative Commons license for everyone to use and enjoy!
In this interview, recorded at QCon London, Jim Webber, ThoughtWorks SOA practice leader, talks to Stefan Tilkov about Guerilla SOA, Description Language (SSDL).
In analyzing my data I wanted to classify it with a naive Bayesian classifier. I wasn't sure I had the math right, so I wrote a tiny abstract classifier to test with. The code is pretty cool:
The process of writing large parallel programs is complicated by the need to specify both the parallel behaviour of the program and the algorithm that is to be used to compute its result.
Duplicity is a backup program that only backs up the files (and parts of files) that have been modified since the last backup. Built on FLOSS (rsync, GnuPG, tar, and rdiff), it allows efficient, locally encrypted, remote backups.
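The core idea of an incremental backup can be sketched as follows: remember a digest of every file at backup time, then on the next run back up only files whose digest is new or different. This is a simplified illustration, not how duplicity itself works: it detects changes per whole file, whereas duplicity uses the rsync algorithm to find changed parts of files. The function names and the manifest format are assumptions.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(root, previous_manifest):
    """Compare files under root with the manifest saved by the last backup.

    Returns (changed, manifest): the relative paths that are new or modified,
    and the fresh manifest (path -> digest) to store for the next run.
    """
    root = Path(root)
    manifest = {}
    changed = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            manifest[rel] = file_digest(path)
            if previous_manifest.get(rel) != manifest[rel]:
                changed.append(rel)
    return changed, manifest
```

On the first run, pass an empty dict as `previous_manifest` so every file counts as changed; on later runs, pass the manifest returned by the previous call.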
Based on this great blog post by Tim McCormack, I managed to write some scripts that back up files to Amazon S3. The files are encrypted with GnuPG and rsync-ed to S3 using a Python-based tool called duplicity.
Welcome to the Unix Tree. Here you can browse the source code and manuals of various old versions of UNIX. For every file, you can also find related files from other versions: this can help show how the different versions of UNIX are related.
Want to write shorter, cleaner code? Have an unfortunate situation where you need to fit as much as you can in one expression? Prefer a quick dose of hacks to spending the rest of your life reading the docs? You've come to the right place.
Today we'll be writing a simple todo list application. My goal is not to show you the finer points of todo lists, but rather to show you how to properly set up a webpy project for small to medium sized applications.
Once again, a post aimed at the PHP community: not so much a rant as something I’ve seen done horribly wrong in a lot of PHP code recently. First, let me take a few examples from a couple of well-known PHP frameworks and libraries:
With the new version of OS X (Leopard) Apple has included some great functionality in Time Machine. Your Mac will automatically back up to an external drive every hour. It includes the ability to recover deleted files in a timeline.
Amazon's S3 is an online storage solution; you pay for only what you use ($0.15/GB/month, plus some transfer costs). I wrote a simple step-by-step guide to setting up a Mac to sync with Amazon S3; here's the executive summary version:
Myghty is a Python based templating framework originally based on HTML::Mason, the enterprise-level framework used by Amazon.com, del.icio.us and Salon.com,
Content Management System, pronounced "CMS", is a generic term for all types of system to manage some kind of content. If you think this sounds loose and fluffy, it is.
The CMF is the Content Management Framework, an important additional piece of infrastructure for Zope which has also influenced the design of Zope 3. CMF is required by Plone and Nuxeo CPS; ERP5 also uses it (see their prerequisites list);
I ended the last post with a discussion of the fundamental problem with Django's Object-Relational Mapper, namely, that it is developed for and by Django
When a web application receives more requests than it can handle over a short period of time, it can become unresponsive. In the worst case, too many concurrent requests to a web application can cause the software which services the application to crash.
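One common way to keep an overloaded application responsive is to cap the number of requests handled at once and reject the excess immediately. The sketch below shows that idea with a semaphore. It is an assumption about a possible mitigation, not something the text above prescribes; the class `RequestLimiter` and the status codes it returns are hypothetical.

```python
import threading


class RequestLimiter:
    """Hypothetical limiter: admit at most max_concurrent requests at a time."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def handle(self, handler, *args):
        # Non-blocking acquire: if no slot is free, shed the request
        # right away instead of letting work pile up.
        if not self._slots.acquire(blocking=False):
            return 503, "busy"
        try:
            return 200, handler(*args)
        finally:
            # Always free the slot, even if the handler raises.
            self._slots.release()
```

Rejecting early with a "busy" response trades a few failed requests for bounded load on the process serving the application.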
Over on the Twisted blog, Duncan McGreggor has asked us to expand a bit on where we think Twisted may be lacking in its support for concurrency. I’m afraid this has turned into a meandering essay, since I needed to reference so muc
Several years ago, a client asked me to come up with a prototype for a real-money online poker bot. That's right: a piece of software you park on your computer while it goes out to a site like PokerStars or Full Tilt and plays no-limit Holdem for you
To even try to keep pace with the rapid evolution of game development, you need a strong foundation in core programming techniques, not a hefty volume on one narrow topic or one that devotes itself to API-specific implementations. Finally, there's a guid
Ksplice allows system administrators to apply security patches to the Linux kernel without having to reboot. Ksplice takes as input a source code change in unified diff format and the kernel source code to be patched, and it applies the patch to the corre
If web architectures, performance, or scalability are topics you would like to keep on top of (who doesn't!), then chances are, you've heard of Nginx ("engine x"). Originally developed by Igor Sysoev for rambler.ru (the second-largest Russian website), it is
This article was inspired by a question asked by a user on the WriteRoom Forum. This person was asking about facilities for working with multiple drafts of a work in progress; a process he was currently doing manually by saving his work with a new timesta