Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries
B. Heinzerling and K. Inui. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1772--1791. Online, Association for Computational Linguistics, (April 2021)
Abstract
Pretrained language models have been suggested as a possible alternative or complement to structured knowledge bases. However, this emerging LM-as-KB paradigm has so far only been considered in a very limited setting, which only allows handling 21k entities whose names are found in common LM vocabularies. Furthermore, a major benefit of this paradigm, i.e., querying the KB using natural language paraphrases, is underexplored. Here we formulate two basic requirements for treating LMs as KBs: (i) the ability to store a large number of facts involving a large number of entities and (ii) the ability to query stored facts. We explore three entity representations that allow LMs to handle millions of entities and present a detailed case study on paraphrased querying of facts stored in LMs, thereby providing a proof-of-concept that language models can indeed serve as knowledge bases.
%0 Conference Paper
%1 heinzerling2021language
%A Heinzerling, Benjamin
%A Inui, Kentaro
%B Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
%C Online
%D 2021
%I Association for Computational Linguistics
%K bert kg knowledge neuralnet nlp
%P 1772--1791
%T Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries
%U https://www.aclweb.org/anthology/2021.eacl-main.153
%X Pretrained language models have been suggested as a possible alternative or complement to structured knowledge bases. However, this emerging LM-as-KB paradigm has so far only been considered in a very limited setting, which only allows handling 21k entities whose names are found in common LM vocabularies. Furthermore, a major benefit of this paradigm, i.e., querying the KB using natural language paraphrases, is underexplored. Here we formulate two basic requirements for treating LMs as KBs: (i) the ability to store a large number of facts involving a large number of entities and (ii) the ability to query stored facts. We explore three entity representations that allow LMs to handle millions of entities and present a detailed case study on paraphrased querying of facts stored in LMs, thereby providing a proof-of-concept that language models can indeed serve as knowledge bases.
@inproceedings{heinzerling2021language,
abstract = {Pretrained language models have been suggested as a possible alternative or complement to structured knowledge bases. However, this emerging LM-as-KB paradigm has so far only been considered in a very limited setting, which only allows handling 21k entities whose names are found in common LM vocabularies. Furthermore, a major benefit of this paradigm, i.e., querying the KB using natural language paraphrases, is underexplored. Here we formulate two basic requirements for treating LMs as KBs: (i) the ability to store a large number of facts involving a large number of entities and (ii) the ability to query stored facts. We explore three entity representations that allow LMs to handle millions of entities and present a detailed case study on paraphrased querying of facts stored in LMs, thereby providing a proof-of-concept that language models can indeed serve as knowledge bases.},
address = {Online},
author = {Heinzerling, Benjamin and Inui, Kentaro},
booktitle = {Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume},
keywords = {bert kg knowledge neuralnet nlp},
month = apr,
pages = {1772--1791},
publisher = {Association for Computational Linguistics},
title = {Language Models as Knowledge Bases: On Entity Representations, Storage Capacity, and Paraphrased Queries},
url = {https://www.aclweb.org/anthology/2021.eacl-main.153},
year = 2021
}