@hangdong

Reveal the Unknown: Out-of-Knowledge-Base Mention Discovery with Entity Linking

(2023). arXiv:2302.07189.

Abstract

Discovering entity mentions that are out of a Knowledge Base (KB) from texts plays a critical role in KB maintenance, but has not yet been fully explored. Current methods are mostly limited to simple threshold-based approaches and feature-based classification, and datasets for evaluation are relatively rare. In this work, we propose BLINKout, a new BERT-based Entity Linking (EL) method which can identify mentions that do not have a corresponding KB entity by matching them to a special NIL entity. To this end, we integrate novel techniques including NIL representation, NIL classification, and synonym enhancement. We also propose Ontology Pruning and Versioning strategies to construct out-of-KB mentions from normal, in-KB EL datasets. Results on four datasets of clinical notes and publications show that BLINKout outperforms existing methods in detecting out-of-KB mentions for the medical ontologies UMLS and SNOMED CT.
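The abstract contrasts a threshold-based baseline with matching a mention to a special NIL entity. The sketch below illustrates only that contrast on toy vectors; it is not BLINKout's implementation, which the abstract describes as BERT-based. All names (`link_with_threshold`, `link_with_nil_entity`, the entity IDs, the vectors, and the threshold value) are hypothetical and chosen for illustration.

```python
from math import sqrt

NIL = "NIL"  # label for mentions with no corresponding KB entity


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_with_threshold(mention_vec, entity_vecs, threshold=0.8):
    """Baseline: pick the best-scoring entity, fall back to NIL below a threshold."""
    best_id, best_score = max(
        ((eid, cosine(mention_vec, vec)) for eid, vec in entity_vecs.items()),
        key=lambda pair: pair[1],
    )
    return best_id if best_score >= threshold else NIL


def link_with_nil_entity(mention_vec, entity_vecs, nil_vec):
    """NIL-as-entity: add a NIL representation to the candidates and take the argmax."""
    candidates = dict(entity_vecs)
    candidates[NIL] = nil_vec
    return max(candidates, key=lambda eid: cosine(mention_vec, candidates[eid]))


# Toy, hand-made embeddings (assumptions, not real model outputs).
entity_vecs = {"C001": [1.0, 0.0, 0.0], "C002": [0.0, 1.0, 0.0]}
nil_vec = [0.0, 0.0, 1.0]

in_kb_mention = [0.9, 0.1, 0.0]      # close to C001
out_of_kb_mention = [0.1, 0.1, 0.9]  # far from every KB entity

print(link_with_threshold(in_kb_mention, entity_vecs))
print(link_with_threshold(out_of_kb_mention, entity_vecs))
print(link_with_nil_entity(in_kb_mention, entity_vecs, nil_vec))
print(link_with_nil_entity(out_of_kb_mention, entity_vecs, nil_vec))
```

In the baseline, the NIL decision depends on a hand-tuned cutoff; in the second variant, NIL competes with the KB entities in the same ranking, so no separate threshold is needed. How the NIL representation is actually learned is not specified in the abstract and is left out of this sketch.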

Description

[2302.07189] Reveal the Unknown: Out-of-Knowledge-Base Mention Discovery with Entity Linking
