Violin: A Large-Scale Dataset for Video-and-Language Inference.

CVPR, page 10897-10907. Computer Vision Foundation / IEEE, (2020)

Other publications of authors with the same name

Modeling Context in Referring Expressions. ECCV (2), volume 9906 of Lecture Notes in Computer Science, page 69-85. Springer, (2016)

GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval. ECCV (35), volume 13695 of Lecture Notes in Computer Science, page 709-725. Springer, (2022)

TVQA+: Spatio-Temporal Grounding for Video Question Answering. CoRR, (2019)

Assistive supernumerary grasping with the back of the hand. ICRA, page 6154-6160. IEEE, (2021)

A Unified Framework for Manifold Landmarking. IEEE Trans. Signal Process., 66 (21): 5563-5576 (2018)

Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models. ECCV (6), volume 12351 of Lecture Notes in Computer Science, page 565-580. Springer, (2020)

Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations. CVPR, page 14825-14835. IEEE, (2023)

FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks. CVPR, page 2669-2680. IEEE, (2023)

Enable back memory and global synchronization on LLC buffer. J. Supercomput., 73 (12): 5414-5439 (2017)

Physics-Inspired Garment Recovery from a Single-View Image. ACM Trans. Graph., 37 (5): 170 (2018)