By clicking on the link below, you will be given a temporary e-mail address. Any e-mails sent to that address will show up automatically on the web page. You can read them, click on links, and even reply to them. The e-mail address will expire after 10 minutes.
This document explains how to do HTML screen scraping. In effect it shows how to treat the Web as a resource by enabling you to retrieve and extract data from HTML Web pages.
Bow (or libbow) is a library of C code useful for writing statistical text analysis, language modeling and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document retrieval (arrow) and document clustering (crossbow).