Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some publications already assigned to that person.


Other publications of authors with the same name

- LongNet: Scaling Transformers to 1,000,000,000 Tokens. (2023) arXiv:2307.02486. Comment: Work in progress.
- LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding. CoRR, (2020)
- DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders. CoRR, (2021)
- BEiT: BERT Pre-Training of Image Transformers. CoRR, (2021)
- Kformer: Knowledge Injection in Transformer Feed-Forward Layers. NLPCC (1), volume 13551 of Lecture Notes in Computer Science, page 131-143. Springer, (2022)
- DiT: Self-supervised Pre-training for Document Image Transformer. ACM Multimedia, page 3530-3539. ACM, (2022)
- UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data. ICML, volume 139 of Proceedings of Machine Learning Research, page 10937-10947. PMLR, (2021)
- Attention Temperature Matters in Abstractive Summarization Distillation. ACL (1), page 127-141. Association for Computational Linguistics, (2022)
- MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding. ACL (1), page 6078-6087. Association for Computational Linguistics, (2022)
- Memory-Efficient Differentiable Transformer Architecture Search. ACL/IJCNLP (Findings), volume ACL/IJCNLP 2021 of Findings of ACL, page 4254-4264. Association for Computational Linguistics, (2021)