Abstract
Object proposals have become an integral preprocessing step of many vision
pipelines, including object detection, weakly supervised detection, object
discovery, and tracking. Compared to learning-free methods, learning-based
proposals have become popular recently due to the growing interest in object
detection. The common paradigm is to learn object proposals from data labeled
with a set of object regions and their corresponding categories. However, this
approach often struggles with novel objects in the open world that are absent
in the training set. In this paper, we trace the problem to the binary
classifiers in existing proposal methods, which tend to overfit to the training
categories. Therefore, we propose a classification-free Object Localization
Network (OLN) which estimates the objectness of each region purely by how well
the location and shape of a region overlap with any ground-truth object (e.g.,
centerness and IoU). This simple strategy learns generalizable objectness and
outperforms existing proposals on cross-category generalization on COCO, as
well as cross-dataset evaluation on RoboNet, Objects365, and EpicKitchens.
Finally, we demonstrate the merit of OLN for long-tail object detection on
the large-vocabulary dataset LVIS, where we observe clear improvements in rare
and common categories.
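
To make the two localization-quality cues named above concrete, the following is a minimal sketch of how IoU and centerness can be computed for axis-aligned boxes. It is not the authors' implementation: the abstract does not define centerness, so the FCOS-style formula is assumed here, and the function names (`iou`, `centerness`, `best_iou`) and the `(x1, y1, x2, y2)` box layout are hypothetical choices for illustration.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def centerness(point, box):
    """Assumed FCOS-style centerness of a point inside a box, in [0, 1]."""
    x, y = point
    x1, y1, x2, y2 = box
    # Distances from the point to the left, right, top, and bottom edges.
    left, right, top, bottom = x - x1, x2 - x, y - y1, y2 - y
    if min(left, right, top, bottom) <= 0:
        return 0.0  # point lies on or outside the box boundary
    ratio_x = min(left, right) / max(left, right)
    ratio_y = min(top, bottom) / max(top, bottom)
    return math.sqrt(ratio_x * ratio_y)


def best_iou(region, gt_boxes):
    """Highest IoU between a region and any ground-truth box.

    Category labels are deliberately ignored, so the score depends
    only on location and shape.
    """
    return max((iou(region, gt) for gt in gt_boxes), default=0.0)
```

In this sketch `best_iou` takes the maximum over all ground-truth boxes regardless of category, which mirrors the idea of scoring a region by its overlap with any annotated object rather than by a foreground/background classifier.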