@misc{citeulike:14556464,
abstract = {{As we move towards large-scale object detection, it is unrealistic to expect
annotated training data for all object classes at sufficient scale, and so
methods capable of unseen object detection are required. We propose a novel
zero-shot method based on training an end-to-end model that fuses semantic
attribute prediction with visual features to propose object bounding boxes for
seen and unseen classes. While we utilize semantic features during training,
our method is agnostic to semantic information for unseen classes at test-time.
Our method retains the efficiency and effectiveness of YOLO for objects seen
during training, while improving its performance for novel and unseen objects.
The ability of state-of-the-art detection methods to learn discriminative object
features to reject background proposals also limits their performance for
unseen objects.
We posit that, to detect unseen objects, we must incorporate semantic
information into the visual domain so that the learned visual features reflect
this information and lead to improved recall rates for unseen objects. We test
our method on the PASCAL VOC and MS COCO datasets and observe significant
improvements in the average precision of unseen classes.}},
added-at = {2019-02-27T22:23:29.000+0100},
archiveprefix = {arXiv},
author = {xxx},
biburl = {https://www.bibsonomy.org/bibtex/25591b13471515036041d583a001cc6f4/nmatsuk},
day = 19,
eprint = {1803.07113},
keywords = {arch detection loss semisup ssd zero\_shot},
month = mar,
posted-at = {2018-03-26 16:57:51},
timestamp = {2019-02-27T22:23:29.000+0100},
title = {{Zero-Shot Detection}},
url = {http://arxiv.org/abs/1803.07113},
year = 2018
}