Article

Large Models are Parsimonious Learners: Activation Sparsity in Trained Transformers.

Z. Li, C. You, S. Bhojanapalli, D. Li, A. Rawat, S. Reddi, K. Ye, F. Chern, F. Yu, R. Guo, and S. Kumar.
CoRR (2022)

Metadata

BibTeX key: journals/corr/abs-2210-06313
entry type: article
year: 2022
journal: CoRR
volume: abs/2210.06313
ee: https://doi.org/10.48550/arXiv.2210.06313
url: http://dblp.uni-trier.de/db/journals/corr/corr2210.html#abs-2210-06313

Tags

dblp
