Author of the publication

Learning Audio-Video Modalities from Image Captions.

, , , , , , and . ECCV (14), volume 13674 of Lecture Notes in Computer Science, page 407-426. Springer, (2022)


Other publications of authors with the same name

Look Before You Speak: Visually Contextualized Utterances., , and . CVPR, page 16877-16887. Computer Vision Foundation / IEEE, (2021)

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning., , , , , , , and . CVPR, page 10714-10726. IEEE, (2023)

Regularizing Neural Networks via Stochastic Branch Layers., , , and . ACML, volume 101 of Proceedings of Machine Learning Research, page 678-693. PMLR, (2019)

Reinforcing an Image Caption Generator Using Off-Line Human Feedback., , , , and . CoRR, (2019)

Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction., , and . CoRR, (2015)

MarioQA: Answering Questions by Watching Gameplay Videos., , , and . CoRR, (2016)

Learning Correlation Structures for Vision Transformers., , , and . CoRR, (2024)

AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR., , and . CVPR, page 22922-22931. IEEE, (2023)

Learning for Single-Shot Confidence Calibration in Deep Neural Networks Through Stochastic Inferences., , and . CVPR, page 9030-9038. Computer Vision Foundation / IEEE, (2019)

Reinforcing an Image Caption Generator Using Off-Line Human Feedback., , , , and . AAAI, page 2693-2700. AAAI Press, (2020)