SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

A. Wang, Y. Pruksachatkun, N. Nangia, A. Singh, J. Michael, F. Hill, O. Levy, and S. Bowman.
(2019). arXiv:1905.00537. Comment: NeurIPS 2019; super.gluebenchmark.com; updating acknowledgements.

Abstract

In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced a little over one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently surpassed the level of non-expert humans, suggesting limited headroom for further research. In this paper we present SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, a software toolkit, and a public leaderboard. SuperGLUE is available at super.gluebenchmark.com.

BibTeX key: wang2019superglue
entry type: misc
year: 2019
url: http://arxiv.org/abs/1905.00537
note: arXiv:1905.00537. Comment: NeurIPS 2019; super.gluebenchmark.com; updating acknowledgements
