Extracting patterns and relations from the world wide web

In WebDB Workshop at 6th International Conference on Extending Database Technology, EDBT’98, pages 172--183. (1998)

Abstract

The World Wide Web is a vast resource for information. At the same time it is extremely distributed. A particular type of data such as restaurant lists may be scattered across thousands of independent information sources in many different formats. In this paper, we consider the problem of extracting a relation for such a data type from all of these sources automatically. We present a technique which exploits the duality between sets of patterns and relations to grow the target relation starting from a small sample. To test our technique we use it to extract a relation of (author,title) pairs from the World Wide Web.
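
The idea of growing a relation from a small sample can be illustrated with a toy bootstrapping loop. The sketch below is not the paper's algorithm. It assumes a pattern is just the literal text immediately before, between, and after a known (author,title) pair. The function names (`occurrences`, `apply_patterns`, `grow_relation`), the parameters `k` and `max_gap`, and the three-document corpus are all hypothetical and invented for illustration.

```python
import re

def occurrences(corpus, relation, k=3, max_gap=20):
    """Collect (prefix, middle, suffix) contexts around known pairs.

    Assumption: a "pattern" is the k characters before the author, the text
    between author and title, and the k characters after the title.
    """
    patterns = set()
    for text in corpus:
        for author, title in relation:
            a = text.find(author)
            while a != -1:
                a_end = a + len(author)
                t = text.find(title, a_end)
                if t != -1 and t - a_end <= max_gap:
                    t_end = t + len(title)
                    prefix = text[max(0, a - k):a]
                    middle = text[a_end:t]
                    suffix = text[t_end:t_end + k]
                    # Skip contexts with an empty part; they would match too loosely.
                    if prefix and middle and suffix:
                        patterns.add((prefix, middle, suffix))
                a = text.find(author, a_end)
    return patterns

def apply_patterns(corpus, patterns):
    """Use each pattern as a regular expression to extract candidate pairs."""
    found = set()
    for prefix, middle, suffix in patterns:
        regex = re.compile(
            re.escape(prefix) + r"(.+?)" + re.escape(middle) + r"(.+?)" + re.escape(suffix)
        )
        for text in corpus:
            for m in regex.finditer(text):
                found.add((m.group(1), m.group(2)))
    return found

def grow_relation(corpus, seed, max_rounds=5):
    """Alternate between finding patterns and finding pairs until nothing new appears."""
    relation = set(seed)
    for _ in range(max_rounds):
        patterns = occurrences(corpus, relation)
        new_pairs = apply_patterns(corpus, patterns) - relation
        if not new_pairs:
            break
        relation |= new_pairs
    return relation

# Invented toy corpus: two pages share a layout, the third does not.
CORPUS = [
    "<li><b>Isaac Asimov</b>, <i>Foundation</i></li>",
    "<li><b>Frank Herbert</b>, <i>Dune</i></li>",
    "<p>Neuromancer was written by William Gibson.</p>",
]
SEED = {("Isaac Asimov", "Foundation")}

result = grow_relation(CORPUS, SEED)
```

In this toy run the seed pair yields one pattern from the first page, and that pattern picks up the pair on the second page. The third page uses a different layout and contributes nothing, which hints at why a realistic system would need pattern generalization and some filtering of overly broad patterns.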

Description

CiteSeerX — Extracting patterns and relations from the world wide web
