Image Captioning and Visual Question Answering Based on Attributes and External Knowledge

Q. Wu, C. Shen, P. Wang, A. Dick, and A. v. d. Hengel. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40 (6): 1367-1381 (June 2018)
DOI: 10.1109/TPAMI.2017.2708709

Abstract

Much of the recent progress in Vision-to-Language problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). This approach does not explicitly represent high-level semantic concepts, but rather seeks to progress directly from image features to text. In this paper we first propose a method of incorporating high-level concepts into the successful CNN-RNN approach, and show that it achieves a significant improvement on the state-of-the-art in both image captioning and visual question answering. We further show that the same mechanism can be used to incorporate external knowledge, which is critically important for answering high-level visual questions. Specifically, we design a visual question answering model that combines an internal representation of the content of an image with information extracted from a general knowledge base to answer a broad range of image-based questions. It particularly allows questions to be asked where the image alone does not contain the information required to select the appropriate answer. Our final model achieves the best reported results for both image captioning and visual question answering on several of the major benchmark datasets.

Links and resources

BibTeX key: wu2018image
entry type: article
year: 2018
month: June
journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
number: 6
pages: 1367-1381
volume: 40
issn: 1939-3539
DOI: 10.1109/TPAMI.2017.2708709
url: https://ieeexplore.ieee.org/document/7934440
