Author of the publication

Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction.

CoNLL, pages 97-107. Association for Computational Linguistics, (2018)


Other publications of authors with the same name

Token-level and sequence-level loss smoothing for RNN language models. ACL (1), pages 2094-2103. Association for Computational Linguistics, (2018)

Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity. EMNLP (Findings), pages 12858-12870. Association for Computational Linguistics, (2023)

Joint source-target encoding with pervasive attention. Mach. Transl., 35 (4): 637-659 (2021)

Added Toxicity Mitigation at Inference Time for Multimodal and Massively Multilingual Translation. CoRR, (2023)

Causes and Cures for Interference in Multilingual Translation. ACL (1), pages 15849-15863. Association for Computational Linguistics, (2023)

Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction. CoNLL, pages 97-107. Association for Computational Linguistics, (2018)

Depth-Adaptive Transformer. ICLR, OpenReview.net, (2020)

Seamless: Multilingual Expressive and Streaming Speech Translation. CoRR, (2023)

Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation. CoRR, (2022)

Efficient Wait-k Models for Simultaneous Machine Translation. INTERSPEECH, pages 1461-1465. ISCA, (2020)