Author of the publication

Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning.

CVPR, pages 2214-2224. IEEE, (2023)

Other publications of authors with the same name

Learning Differentiable Grammars for Continuous Data. CoRR, (2019)
4D-Net for Learned Multi-Modal Alignment. ICCV, pages 15415-15425. IEEE, (2021)
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification. ECCV (8), volume 12353 of Lecture Notes in Computer Science, pages 449-465. Springer, (2020)
PaLI: A Jointly-Scaled Multilingual Language-Image Model. ICLR, OpenReview.net, (2023)
AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures. ICLR, OpenReview.net, (2020)
Diversifying Joint Vision-Language Tokenization Learning. CoRR, (2023)
Unsupervised Action Segmentation for Instructional Videos. CoRR, (2021)
Compound Tokens: Channel Fusion for Vision-Language Representation Learning. CoRR, (2022)
Evolving Losses for Unlabeled Video Representation Learning. CoRR, (2019)
AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures. CoRR, (2019)