Abstract
We study the problem of large-scale, multi-label visual recognition with a
large number of possible classes. We propose a method for augmenting a trained
neural network classifier with auxiliary capacity in a manner designed to
significantly improve upon an already well-performing model, while minimally
impacting its computational footprint. Using the predictions of the network
itself as a descriptor for assessing visual similarity, we define a
partitioning of the label space into groups of visually similar entities. We
then augment the network with auxiliary hidden layer pathways with
connectivity only to these groups of label units. We report a significant
improvement in mean average precision on a large-scale object recognition task
with the augmented model, while increasing the number of multiply-adds by less
than 3%.
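
The two steps described above (grouping labels by their prediction profiles, then adding pathways wired only to each group) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `group_labels` and `augmented_logits`, the use of cosine-similarity k-means with a farthest-first initialization as the partitioning procedure, the ReLU hidden layer, and the choice to add each auxiliary output to the base logits.

```python
import numpy as np


def group_labels(pred, n_groups, n_iter=20):
    """Partition labels into groups of similar prediction profiles.

    pred: (n_images, n_labels) scores from the already-trained network.
    Each label is described by its L2-normalized column of scores
    (an assumed descriptor; the abstract does not give the exact form).
    Returns an array of length n_labels holding a group index per label.
    """
    desc = np.asarray(pred, dtype=float).T
    desc = desc / (np.linalg.norm(desc, axis=1, keepdims=True) + 1e-12)

    # Deterministic farthest-first initialization of the group centers.
    centers = [desc[0]]
    while len(centers) < n_groups:
        sim = np.max(desc @ np.array(centers).T, axis=1)
        centers.append(desc[np.argmin(sim)])
    centers = np.array(centers)

    # Plain k-means iterations using cosine similarity.
    for _ in range(n_iter):
        assign = np.argmax(desc @ centers.T, axis=1)
        for g in range(n_groups):
            members = desc[assign == g]
            if len(members):
                centers[g] = members.mean(axis=0)
    return assign


def augmented_logits(features, base_logits, assign, aux_weights):
    """Add group-restricted auxiliary pathways to the base predictions.

    features:    (n_images, d) activations feeding the auxiliary pathways.
    base_logits: (n_images, n_labels) outputs of the original network.
    aux_weights: {group: (w_in, w_out)} with w_in of shape (d, h) and
                 w_out of shape (h, number of labels in that group).
    Each pathway writes only to the label units of its own group.
    """
    logits = base_logits.copy()
    for g, (w_in, w_out) in aux_weights.items():
        idx = np.flatnonzero(assign == g)
        hidden = np.maximum(features @ w_in, 0.0)  # ReLU hidden layer
        logits[:, idx] += hidden @ w_out
    return logits
```

Because each auxiliary pathway connects to only a small subset of label units, its output matrix is far smaller than a fully connected layer over all labels, which is consistent with the small increase in multiply-adds reported above.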