Abstract
This work addresses the task of open world semantic segmentation using RGBD
sensing to discover new semantic classes over time. Although there are many
types of objects in the real world, current semantic segmentation methods make a
closed world assumption and are trained only to segment a limited number of
object classes. Towards a more open world approach, we propose a novel method
that incrementally learns new classes for image segmentation. The proposed
system first segments each RGBD frame using both color and geometric
information, and then aggregates that information to build a single segmented
dense 3D map of the environment. The segmented 3D map representation is a key
component of our approach as it is used to discover new object classes by
identifying coherent regions in the 3D map that have no semantic label. The use
of coherent regions in the 3D map as primitive elements, rather than
traditional elements such as surfels or voxels, also significantly reduces the
computational complexity and memory use of our method. This yields
semi-real-time performance at 10.7 Hz when incrementally updating the dense 3D
map at every frame. Through experiments on the NYUDv2 dataset, we demonstrate
that the proposed method is able to correctly cluster objects of both known and
unseen classes. We also report a quantitative comparison with
state-of-the-art supervised methods, the processing time of each step, and the
influence of each component.
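
The discovery step described above, finding coherent regions of the 3D map that carry no semantic label, can be illustrated with a minimal sketch. This is not the authors' implementation: the data layout (one region id per map element, a region-to-label dictionary), the UNKNOWN marker, the function name find_unlabeled_regions, and the min_points threshold are all assumptions made for illustration.

```python
from collections import Counter

UNKNOWN = -1  # hypothetical marker meaning "no semantic label assigned"


def find_unlabeled_regions(region_of_point, label_of_region, min_points=3):
    """Return ids of map regions that have no semantic label.

    region_of_point: list with one region id per 3D map element (assumed layout).
    label_of_region: dict mapping region id -> class id, or UNKNOWN.
    min_points: hypothetical size threshold that discards tiny, noisy regions.
    """
    sizes = Counter(region_of_point)  # number of map elements per region
    return sorted(
        region
        for region, size in sizes.items()
        if label_of_region.get(region, UNKNOWN) == UNKNOWN and size >= min_points
    )


# Toy map: region 0 is labeled, regions 1 and 2 are marked unknown,
# and region 3 has no label entry at all.
region_of_point = [0, 0, 0, 1, 1, 1, 1, 2, 3, 3, 3]
label_of_region = {0: 5, 1: UNKNOWN, 2: UNKNOWN}

candidates = find_unlabeled_regions(region_of_point, label_of_region)
print(candidates)
```

In this toy example, regions 1 and 3 are returned as candidates for new classes, while region 2 is dropped because it is smaller than the assumed size threshold. Working at the level of regions rather than individual surfels or voxels keeps the number of items to examine small, which is the efficiency argument the abstract makes.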