
MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering.

EMNLP (Findings), volume EMNLP 2020 of Findings of ACL, pp. 4648-4660. Association for Computational Linguistics, (2020)


Other publications by persons with the same name

Learning a Multi-concept Video Retrieval Model with Multiple Latent Variables. ISM, pp. 615-620. IEEE Computer Society, (2016)
Video Generation from Text Employing Latent Path Construction for Temporal Modeling. CoRR, (2021)
WoundNet: A Domain-Adaptable Few-Shot Classification Framework for Wound Healing Assessment. ISBI, pp. 1-5. IEEE, (2023)
UCF-CRCV at TRECVID 2015: Semantic Indexing. TRECVID, National Institute of Standards and Technology (NIST), (2015)
MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering. EMNLP (Findings), volume EMNLP 2020 of Findings of ACL, pp. 4648-4660. Association for Computational Linguistics, (2020)
Pay Attention! - Robustifying a Deep Visuomotor Policy Through Task-Focused Visual Attention. CVPR, pp. 4254-4262. Computer Vision Foundation / IEEE, (2019)
Visual Text Correction. ECCV (13), volume 11217 of Lecture Notes in Computer Science, pp. 159-175. Springer, (2018)
Context-Aware Analysis of Group Submissions for Group Anomaly Detection and Performance Prediction. AAAI, pp. 15938-15946. AAAI Press, (2023)
Video Generation from Text Employing Latent Path Construction for Temporal Modeling. ICPR, pp. 5010-5016. IEEE, (2022)
Deep Photo Cropper And Enhancer. ICIP, pp. 993-997. IEEE, (2020)