Author of the publication

HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections.

, , , , and . CoRR, (2020)

Other publications of authors with the same name

Rethinking search: making domain experts out of dilettantes., , , and . SIGIR Forum, 55 (1): 13:1-13:27 (2021)

Transformer Memory as a Differentiable Search Index., , , , , , , , , and 3 other author(s). NeurIPS, (2022)

UL2: Unifying Language Learning Paradigms., , , , , , , , , and 3 other author(s). ICLR, OpenReview.net, (2023)

Reverse Engineering Configurations of Neural Text Generation Models., , , , , and . ACL, page 275-279. Association for Computational Linguistics, (2020)

Synthesizer: Rethinking Self-Attention for Transformer Models., , , , , and . ICML, volume 139 of Proceedings of Machine Learning Research, page 10183-10192. PMLR, (2021)

Sparse Sinkhorn Attention., , , , and . ICML, volume 119 of Proceedings of Machine Learning Research, page 9438-9447. PMLR, (2020)

Deep k-NN for Noisy Labels., , and . ICML, volume 119 of Proceedings of Machine Learning Research, page 540-550. PMLR, (2020)

Label Smoothed Embedding Hypothesis for Out-of-Distribution Detection., , , and . CoRR, (2021)

StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling., , , , , and . CoRR, (2020)

Surprise: Result List Truncation via Extreme Value Theory., , , , and . SIGIR, page 2404-2408. ACM, (2023)