Abstract
We describe crawling software designed for high-performance, large-scale information discovery and gathering on the Web. The crawler lets the administrator balance the volume of a Web collection against its freshness, and provides the flexibility to define a quality metric that prioritizes certain pages.
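The prioritization the abstract mentions can be pictured as a crawl frontier ordered by a pluggable quality function. The sketch below is purely illustrative, not the paper's implementation; the class and parameter names (`Frontier`, `quality`) are assumptions.

```python
import heapq
from typing import Callable

class Frontier:
    """Hypothetical crawl frontier: URLs are popped in order of a
    user-supplied quality metric, as the abstract describes."""

    def __init__(self, quality: Callable[[str], float]):
        self.quality = quality
        self._heap = []        # min-heap of (-score, url) pairs
        self._seen = set()     # avoid enqueueing a URL twice

    def add(self, url: str) -> None:
        if url not in self._seen:
            self._seen.add(url)
            heapq.heappush(self._heap, (-self.quality(url), url))

    def pop(self) -> str:
        # Highest-scoring URL comes out first.
        return heapq.heappop(self._heap)[1]

# Toy metric for demonstration: longer URLs score higher.
frontier = Frontier(quality=len)
frontier.add("http://a.example/x")
frontier.add("http://b.example/long/path")
print(frontier.pop())
```

Swapping in a different `quality` function (e.g. one based on link structure or site importance) changes the crawl order without touching the frontier itself.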