Author of the publication

M2-RAAP: A Multi-Modal Recipe for Advancing Adaptation-based Pre-training towards Effective and Efficient Zero-shot Video-text Retrieval.

SIGIR, pages 2156-2166. ACM, (2024)

Other publications of authors with the same name

M2-RAAP: A Multi-Modal Recipe for Advancing Adaptation-based Pre-training towards Effective and Efficient Zero-shot Video-text Retrieval. SIGIR, pages 2156-2166. ACM, (2024)

HOTVCOM: Generating Buzzworthy Comments for Videos. ACL (Findings), pages 2198-2224. Association for Computational Linguistics, (2024)

Automatic Car Damage Assessment System: Reading and Understanding Videos as Professional Insurance Inspectors. AAAI, pages 13646-13647. AAAI Press, (2020)

CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video-Text Dataset. CVPR, pages 14815-14824. IEEE, (2023)

LPSNet: A Lightweight Solution for Fast Panoptic Segmentation. CVPR, pages 16746-16754. Computer Vision Foundation / IEEE, (2021)

Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input. ECCV (36), volume 13696 of Lecture Notes in Computer Science, pages 330-346. Springer, (2022)

Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs. CoRR, (2023)

Boundary-aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval. CVPR, pages 15201-15210. IEEE, (2023)

Hummer: Towards Limited Competitive Preference Dataset. CoRR, (2024)

SHE-Net: Syntax-Hierarchy-Enhanced Text-Video Retrieval. CoRR, (2024)