Abstract

The recognition of Proper Nouns (PNs) is considered an important task in the area of Information Retrieval and Extraction. However, the high performance of most existing PN classifiers depends heavily on the availability of large dictionaries of domain-specific Proper Nouns, and on a certain amount of manual work for rule writing or manual tagging. Relying on an existing PN dictionary is not a heavy requirement (such resources are often available on the web), but its coverage of a domain corpus may be rather low in the absence of manual updating. In this paper we propose a technique for the automatic updating of a PN dictionary through the cooperation of an inductive and a probabilistic classifier. In our experiments we show that, whenever an existing PN dictionary allows the identification of 50% of the proper nouns within a corpus, our technique allows, without additional manual effort, the successful recognition of about 90% of the remaining 50%.
