Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to the person.

 

Other publications of authors with the same name

TRIPS: Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection. EMNLP, pages 4084-4096. Association for Computational Linguistics, (2022)

Similarity Learning For Cover Song Identification Using Cross-Similarity Matrices of Multi-Level Deep Sequences. ICASSP, pages 26-30. IEEE, (2020)

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model. CoRR, (2023)

BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization. ICCV, pages 2888-2898. IEEE, (2023)

Learn A Robust Representation For Cover Song Identification Via Aggregating Local And Global Music Temporal Context. ICME, pages 1-6. IEEE, (2020)

Exploiting Pseudo Image Captions for Multimodal Summarization. ACL (Findings), pages 161-175. Association for Computational Linguistics, (2023)

Vision Language Pre-training by Contrastive Learning with Cross-Modal Similarity Regulation. ACL (1), pages 14660-14679. Association for Computational Linguistics, (2023)

Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models. CoRR, (2024)

COPA: Efficient Vision-Language Pre-training through Collaborative Object- and Patch-Text Alignment. ACM Multimedia, pages 4480-4491. ACM, (2023)

Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection. CoRR, (2024)