Crawling the Web

G. Pant, P. Srinivasan, and F. Menczer. Pages 153--177. Springer Berlin Heidelberg, Berlin, Heidelberg, (2004)
DOI: 10.1007/978-3-662-10874-1_7

Abstract

The large size and the dynamic nature of the Web make it necessary to continually maintain Web-based information retrieval systems. Crawlers facilitate this process by following hyperlinks in Web pages to automatically download new and updated Web pages. While some systems rely on crawlers that exhaustively crawl the Web, others incorporate "focus" within their crawlers to harvest application- or topic-specific collections. In this chapter we discuss the basic issues related to developing an infrastructure for crawlers. This is followed by a review of several topical crawling algorithms, and evaluation metrics that may be used to judge their performance. Given that many innovative applications of Web crawling are still being invented, we briefly discuss some that have already been developed.

Links and resources

BibTeX key: Pant2004
entry type: inbook
address: Berlin, Heidelberg
booktitle: Web Dynamics: Adapting to Change in Content, Size, Topology and Use
year: 2004
pages: 153--177
publisher: Springer Berlin Heidelberg
isbn: 978-3-662-10874-1
DOI: 10.1007/978-3-662-10874-1_7
url: https://doi.org/10.1007/978-3-662-10874-1_7
