Author of the publication

Multi-Speaker Video Dialog with Frame-Level Temporal Localization.

AAAI, page 12200-12207. AAAI Press, (2020)

Other publications of authors with the same name

Learning Max-Margin GeoSocial Multimedia Network Representations for Point-of-Interest Suggestion. SIGIR, page 833-836. ACM, (2017)

Two-Order Deep Learning for Generalized Synthesis of Radiation Patterns for Antenna Arrays. IEEE Trans. Artif. Intell., 4 (5): 1359-1368 (October 2023)

Efficient location-based search of trajectories with location importance. Knowl. Inf. Syst., 45 (1): 215-245 (2015)

User Preference Learning for Online Social Recommendation. IEEE Trans. Knowl. Data Eng., 28 (9): 2522-2534 (2016)

TaoHighlight: Commodity-Aware Multi-Modal Video Highlight Detection in E-Commerce. IEEE Trans. Multim., (2022)

Frame-Subtitle Self-Supervision for Multi-Modal Video Question Answering. CoRR, (2022)

Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models. CoRR, (2023)

Open-Vocabulary Object Detection With an Open Corpus. ICCV, page 6736-6746. IEEE, (2023)

Separate-to-Recognize: Joint Multi-target Speech Separation and Speech Recognition for Speaker-attributed ASR. ISCSLP, page 150-154. IEEE, (2022)

Video Question Answering via Knowledge-based Progressive Spatial-Temporal Attention Network. TOMM, 15 (2s): 52:1-52:22 (2019)