

Poster

Disentangled 3D Scene Generation with Layout Learning

Dave Epstein · Ben Poole · Ben Mildenhall · Alexei Efros · Aleksander Holynski


Abstract:

We introduce a method to generate 3D scenes that are disentangled into their component objects. This disentanglement is unsupervised, relying only on the knowledge of a large pretrained text-to-image model. Our key insight is that objects can be discovered by finding parts of a 3D scene that, when rearranged spatially, still produce valid configurations of the same scene. Concretely, our method jointly optimizes multiple NeRFs---each representing its own object---along with a set of layouts that composite these objects into scenes. We then encourage these composited scenes to be in-distribution according to the image generator. We show that despite its simplicity, our approach successfully generates 3D scenes decomposed into individual objects, enabling new capabilities in text-to-3D content creation.
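The core compositing idea — several per-object fields plus a layout of rigid transforms that places them into one scene — can be sketched in miniature. This is an illustrative toy only: the names (`make_object_density`, `composite_density`), the Gaussian-sphere stand-in for a NeRF, and the translation-only layout are assumptions, not the paper's actual implementation, which jointly optimizes NeRF and layout parameters under a score-distillation loss from the text-to-image model.

```python
import math

def make_object_density(center, radius=1.0):
    """Toy stand-in for a per-object NeRF: a soft Gaussian sphere of density
    defined in the object's own canonical frame. (Hypothetical helper.)"""
    cx, cy, cz = center
    def density(p):
        dx, dy, dz = p[0] - cx, p[1] - cy, p[2] - cz
        return math.exp(-(dx*dx + dy*dy + dz*dz) / radius**2)
    return density

def composite_density(objects, layout, p):
    """Density of the composited scene at world point p: each object's query
    is mapped into its canonical frame by undoing that object's layout
    transform (here just a translation), and per-object densities are summed."""
    total = 0.0
    for density, (tx, ty, tz) in zip(objects, layout):
        total += density((p[0] - tx, p[1] - ty, p[2] - tz))
    return total

# Two identical canonical objects placed at different world positions:
objects = [make_object_density((0.0, 0.0, 0.0)),
           make_object_density((0.0, 0.0, 0.0))]
layout = [(2.0, 0.0, 0.0), (-2.0, 0.0, 0.0)]
```

In the actual method, both the per-object fields and the layout transforms would be learnable, and gradients from the image-generator loss on renders of the composited scene flow back through this compositing step.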
