How Much Knowledge Can You Pack Into the Parameters of a Language Model?
A. Roberts, C. Raffel, and N. Shazeer (2020). arXiv:2002.08910. Comment: Added results using "salient span masking" (Guu et al., 2020), achieving a new state of the art on open-domain WebQuestions and TriviaQA.
Abstract
It has recently been observed that neural language models trained on
unstructured text can implicitly store and retrieve knowledge using natural
language queries. In this short paper, we measure the practical utility of this
approach by fine-tuning pre-trained models to answer questions without access
to any external context or knowledge. We show that this approach scales
surprisingly well with model size and outperforms models that explicitly look
up knowledge on the open-domain variants of Natural Questions and WebQuestions.
To facilitate reproducibility and future work, we release our code and trained
models.
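The setting described here is what the paper calls "closed-book" question answering: the fine-tuned model receives only the question and must produce the answer from knowledge stored in its parameters, with no retrieved passages. A minimal sketch of querying such a model with the Hugging Face Transformers library is shown below; the checkpoint name "google/t5-small-ssm-nq" is an assumption (taken to be one of the publicly released T5 closed-book QA checkpoints for Natural Questions), not something stated in the abstract.

# Closed-book QA sketch: the question is the entire input and no context
# passage is supplied, so a correct answer must come from knowledge stored
# in the model's parameters. The checkpoint name below is an assumption,
# not taken from the paper text.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/t5-small-ssm-nq"  # assumed released closed-book QA checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

question = "When was Franklin D. Roosevelt born?"
inputs = tokenizer(question, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Larger checkpoints in the same family answer a substantially higher fraction of questions correctly, which is the scaling effect with model size that the abstract refers to.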
@misc{roberts2020knowledge,
author = {Roberts, Adam and Raffel, Colin and Shazeer, Noam},
keywords = {languagemodel},
note = {arXiv:2002.08910. Comment: Added results using "salient span masking" (Guu et al., 2020), achieving a new state of the art on open-domain WebQuestions and TriviaQA},
title = {How Much Knowledge Can You Pack Into the Parameters of a Language Model?},
url = {http://arxiv.org/abs/2002.08910},
year = 2020
}