Abstract
Planning for robotic manipulation requires reasoning about the changes a
robot can effect on objects. When such interactions can be modelled
analytically, as in domains with rigid objects, efficient planning algorithms
exist. However, in both domestic and industrial domains, the objects of
interest can be soft or deformable, and hard to model analytically. For such
cases, we posit that a data-driven modelling approach is more suitable. In
recent years, progress in deep generative models has produced methods that
learn to 'imagine' plausible images from data. Building on the recent Causal
InfoGAN generative model, in this work we learn to imagine goal-directed object
manipulation directly from raw image data of the robot's self-supervised
interaction with the object. After learning, given a goal observation of the system,
our model can generate an imagined plan -- a sequence of images that transition
the object into the desired goal. To execute the plan, we use it as a reference
trajectory to track with a visual servoing controller, which we also learn from
the data as an inverse dynamics model. In a simulated manipulation task, we
show that separating the problem into visual planning and visual tracking
control is more sample-efficient and more interpretable than alternative
data-driven approaches. We further demonstrate our approach by learning to
imagine and execute in three environments, the last of which is deformable rope
manipulation on a PR2 robot.
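The abstract's decomposition into visual planning followed by tracking with a learned inverse-dynamics controller can be pictured as a simple control loop. The sketch below is illustrative only and not code from the paper; the names imagined_plan, observe, inverse_dynamics, and apply_action are hypothetical placeholders for the planner output, the camera, the learned model, and the robot interface.

import numpy as np

def track_imagined_plan(imagined_plan, observe, inverse_dynamics, apply_action):
    """Follow a sequence of imagined goal images with a learned controller.

    imagined_plan    : list of images produced by the generative visual planner
    observe()        : returns the current camera image
    inverse_dynamics : learned model mapping (current image, next image) -> action
    apply_action(a)  : executes the action on the robot
    """
    for subgoal in imagined_plan:
        current = observe()
        # The inverse-dynamics model acts as a visual servoing controller:
        # it predicts the action that moves the scene from the current
        # observation toward the next imagined frame.
        action = inverse_dynamics(current, subgoal)
        apply_action(action)

# Toy stand-ins so the sketch runs end to end (hypothetical, for illustration only).
if __name__ == "__main__":
    plan = [np.zeros((64, 64, 3)) for _ in range(5)]        # imagined image sequence
    track_imagined_plan(
        plan,
        observe=lambda: np.zeros((64, 64, 3)),
        inverse_dynamics=lambda cur, goal: np.zeros(7),      # placeholder 7-DoF action
        apply_action=lambda a: None,
    )

Separating planning from control in this way means the imagined image sequence serves purely as a reference trajectory, while the inverse-dynamics model only ever has to predict one short-horizon action at a time.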
Description
Learning Robotic Manipulation through Visual Planning and Acting