
Extracting Patterns and Relations from the World Wide Web

In WebDB Workshop at 6th International Conference on Extending Database Technology, EDBT’98, pages 172-183. (1998)


Abstract. The World Wide Web is a vast resource for information. At the same time it is extremely distributed. A particular type of data such as restaurant lists may be scattered across thousands of independent information sources in many different formats. In this paper, we consider the problem of extracting a relation for such a data type from all of these sources automatically. We present a technique which exploits the duality between sets of patterns and relations to grow the target relation starting from a small sample. To test our technique we use it to extract a relation of (author,title) pairs from the World Wide Web.
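The pattern/relation duality mentioned in the abstract can be illustrated with a toy bootstrapping loop. This is only a minimal sketch under simplifying assumptions, not the paper's actual algorithm: each "document" is a single line of text, a pattern is just the literal (prefix, middle, suffix) text around a title followed by an author, and the function names `induce_patterns`, `apply_patterns`, and `grow_relation` are hypothetical.

```python
import re

def induce_patterns(pairs, lines):
    """Derive (prefix, middle, suffix) patterns from lines containing a known pair."""
    patterns = set()
    for line in lines:
        for author, title in pairs:
            t = line.find(title)
            a = line.find(author)
            # Toy assumption: the title appears before the author on the line.
            if t != -1 and a != -1 and t + len(title) <= a:
                prefix = line[:t]
                middle = line[t + len(title):a]
                suffix = line[a + len(author):]
                patterns.add((prefix, middle, suffix))
    return patterns

def apply_patterns(patterns, lines):
    """Use each pattern to extract new (author, title) pairs from the lines."""
    pairs = set()
    for prefix, middle, suffix in patterns:
        if not middle:
            continue  # an empty separator cannot split title from author
        regex = re.compile(
            re.escape(prefix) + r"(.+?)" + re.escape(middle) + r"(.+)" + re.escape(suffix)
        )
        for line in lines:
            m = regex.fullmatch(line)
            if m:
                pairs.add((m.group(2), m.group(1)))
    return pairs

def grow_relation(seeds, lines, max_rounds=5):
    """Alternate between inducing patterns and extracting pairs until nothing new appears."""
    relation = set(seeds)
    for _ in range(max_rounds):
        patterns = induce_patterns(relation, lines)
        new_relation = relation | apply_patterns(patterns, lines)
        if new_relation == relation:
            break
        relation = new_relation
    return relation

# Made-up sample data, for illustration only.
lines = [
    "<li><b>Foundation</b> by Isaac Asimov</li>",
    "<li><b>Dune</b> by Frank Herbert</li>",
    "Dune / Frank Herbert",
    "Hyperion / Dan Simmons",
]
seeds = {("Isaac Asimov", "Foundation")}
relation = grow_relation(seeds, lines)
```

In this sample, the single seed pair yields the list-item pattern, which finds a second pair; that pair in turn appears in a different format, producing a second pattern that finds a third pair. A realistic system would also need to score patterns for specificity to avoid overly general matches.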


