Author of the publication

Deep Transformers with Latent Depth

Advances in Neural Information Processing Systems, 33, pages 1736-1746. Curran Associates, Inc., (2020)


Other publications of authors with the same name

BLT: Bidirectional Layout Transformer for Controllable Layout Generation. ECCV (17), volume 13677 of Lecture Notes in Computer Science, pages 474-490. Springer, (2022)

Generalized Data Augmentation for Low-Resource Translation. ACL (1), pages 5786-5796. Association for Computational Linguistics, (2019)

Mega: Moving Average Equipped Gated Attention. CoRR, (2022)

Performance Improvement of Probabilistic Transcriptions with Language-specific Constraints. SLTU, volume 81 of Procedia Computer Science, pages 30-36. Elsevier, (2016)

Incorporating a Local Translation Mechanism into Non-autoregressive Translation. EMNLP (1), pages 1067-1073. Association for Computational Linguistics, (2020)

Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation. ACL (1), pages 9688-9712. Association for Computational Linguistics, (2024)

Dynamic routing with endpoint admission control for VoIP networks. ICC, pages 1728-1732. IEEE, (2003)

Exploring the dynamics of the disaggregated intercity corporate network in the Yangtze River Delta, China: a relational event approach. J. Geogr. Syst., 24 (1): 115-140 (2022)

Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders. EACL, pages 1613-1624. Association for Computational Linguistics, (2021)

Large Language Model-guided Document Selection. CoRR, (2024)