Author of the publication

Look at What I'm Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos.

NeurIPS, pages 14476-14487. (2021)


Other publications of authors with the same name

Conditional Generation of Audio from Video via Foley Analogies. CVPR, pages 2426-2436. IEEE, (2023)
Telling Left From Right: Learning Spatial Correspondence of Sight and Sound. CVPR, pages 9929-9938. Computer Vision Foundation / IEEE, (2020)
Contrastive Feature Loss for Image Prediction. ICCVW, pages 1934-1943. IEEE, (2021)
Look at What I'm Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos. NeurIPS, pages 14476-14487. (2021)
Object Recognition by Scene Alignment. Advances in Neural Information Processing Systems 20, MIT Press, Cambridge, MA, (2008)
Editing Conditional Radiance Fields. ICCV, pages 5753-5763. IEEE, (2021)
It's Time for Artistic Correspondence in Music and Video. CVPR, pages 10554-10564. IEEE, (2022)
Language-Guided Audio-Visual Source Separation via Trimodal Consistency. CVPR, pages 10575-10584. IEEE, (2023)
Monocular Dynamic View Synthesis: A Reality Check. NeurIPS, (2022)
Koala: Key frame-conditioned long video-LLM. CoRR, (2024)