Author of the publication

Video Question Answering with Spatio-Temporal Reasoning.

, , , , , and . Int. J. Comput. Vis., 127 (10): 1385-1412 (2019)

Other publications of authors with the same name

Encoding Video and Label Priors for Multi-label Video Classification on YouTube-8M dataset., , , , and . CoRR, (2017)
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization., , , , , , , , , and 1 other author(s). CoRR, (2022)
NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics., , , , , , , , , and 2 other author(s). CoRR, (2021)
Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context., , and . Fig-Lang@ACL, page 12-17. Association for Computational Linguistics, (2020)
ProsocialDialog: A Prosocial Backbone for Conversational Agents., , , , , , , and . EMNLP, page 4005-4029. Association for Computational Linguistics, (2022)
Supervising Neural Attention Models for Video Captioning by Human Gaze Data., , , , , and . CVPR, page 6119-6127. IEEE Computer Society, (2017)
Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms., , , , , , , and . EMNLP, page 894-914. Association for Computational Linguistics, (2023)
Localized Symbolic Knowledge Distillation for Visual Commonsense Models., , , , , , , , , and 1 other author(s). NeurIPS, (2023)
CLARA: Classifying and Disambiguating User Commands for Reliable Interactive Robotic Agents., , , , , , and . IEEE Robotics Autom. Lett., 9 (2): 1059-1066 (February 2024)
Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback., , , , and . CoRR, (2024)