H. Li, Z. Xu, G. Taylor, C. Studer, and T. Goldstein. (2017). arXiv:1712.09913. Comment: NIPS 2018 (extended version, 10.5 pages); code is available at https://github.com/tomgoldstein/loss-landscape.
S. Merity. (2019). arXiv:1911.11423. Comment: Addition of citations and contextual results (no attention head, single attention head, attention per layer), removal of wordpiece WikiText-103 numbers due to normalization issues, fix of SHA attention figure Q arrow, other minor fixes.
A. Clauset, C. Shalizi, and M. Newman. (2007). arXiv:0706.1062. Comment: 43 pages, 11 figures, 7 tables, 4 appendices; code available at http://www.santafe.edu/~aaronc/powerlaws/.