Abstract
Language Models like ELMo and BERT have provided robust representations of
natural language, which serve as the language understanding component for a
diverse range of downstream tasks.Curriculum learning is a method that employs
a structured training regime instead, which has been leveraged in computer
vision and machine translation to improve model training speed and model
performance. While language models have proven transformational for the natural
language processing community, these models have proven expensive,
energy-intensive, and challenging to train. In this work, we explore the effect
of curriculum learning on language model pretraining using various
linguistically motivated curricula and evaluate transfer performance on the
GLUE Benchmark. Despite a broad variety of training methodologies and
experiments we do not find compelling evidence that curriculum learning methods
improve language model training.
Users
Please
log in to take part in the discussion (add own reviews or comments).