We are a research group within the Stanford Vision & Learning Lab that develops methods and mechanisms for generalizable robot perception and control.
We work on challenging open problems at the intersection of computer vision, machine learning, and robotics. We develop algorithms and systems that unify reinforcement learning, control-theoretic modeling, and 2D/3D visual scene understanding to teach robots to perceive and interact with the physical world.
Gibson’s underlying database of spaces includes 572 full buildings composed of 1,447 floors covering a total area of 211,000 m². The database is collected from real indoor spaces using 3D scanning and reconstruction. For each space, we provide the 3D reconstruction, RGB images, depth, surface normals, and, for a fraction of the spaces, semantic object annotations. On this page you can see various visualizations for each space, including 3D dissections, exploration by a randomly actuated Husky agent, and standard point-to-point navigation episodes.
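A point-to-point navigation episode follows the familiar reset/step loop: the agent starts at one location, receives observations, and acts until it reaches the goal or a step budget runs out. The sketch below illustrates that loop with a toy grid world standing in for a scanned building; `GridNavEnv`, its action encoding, and the reward values are all illustrative assumptions, not the actual Gibson API (which additionally renders RGB, depth, and surface-normal observations).

```python
import random

class GridNavEnv:
    """Toy 2D grid stand-in for a navigation space: reach the goal cell.
    Illustrative only; not the Gibson environment interface."""

    def __init__(self, size=8):
        self.size = size
        self.goal = (size - 1, size - 1)

    def reset(self):
        # Episodes start from a fixed corner; Gibson samples start/goal poses.
        self.pos = (0, 0)
        return self.pos

    def step(self, action):
        # Actions: 0=up, 1=down, 2=left, 3=right (hypothetical encoding).
        dx, dy = [(0, -1), (0, 1), (-1, 0), (1, 0)][action]
        x = min(max(self.pos[0] + dx, 0), self.size - 1)
        y = min(max(self.pos[1] + dy, 0), self.size - 1)
        self.pos = (x, y)
        done = self.pos == self.goal
        reward = 1.0 if done else -0.01  # sparse goal reward, small step cost
        return self.pos, reward, done

def run_episode(env, max_steps=500, seed=0):
    """Run one episode with a random-exploration policy; return steps used."""
    rng = random.Random(seed)
    env.reset()
    for t in range(max_steps):
        obs, reward, done = env.step(rng.randrange(4))
        if done:
            return t + 1
    return max_steps

steps = run_episode(GridNavEnv())
print(steps)
```

Swapping the random policy for a learned one, and the grid for a rendered 3D scan, gives the standard evaluation setup for the navigation episodes shown on this page.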