NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models

Kim, Seung Wook; Brown, Bradley; Yin, Kangxue; Kreis, Karsten; Schwarz, Katja; Li, Daiqing; Rombach, Robin; Torralba, Antonio and Fidler, Sanja (2023): NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, 17-24 June 2023. In: Brown, Michael S. (ed.): 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Piscataway: IEEE, pp. 8496-8506

Full text not available from 'Open Access LMU'.

DOI: 10.1109/CVPR52729.2023.00821

Abstract

Automatically generating high-quality real-world 3D scenes is of enormous interest for applications such as virtual reality and robotics simulation. Towards this goal, we introduce NeuralField-LDM, a generative model capable of synthesizing complex 3D environments. We leverage Latent Diffusion Models, which have been successfully utilized for efficient high-quality 2D content creation. We first train a scene auto-encoder to express a set of image and pose pairs as a neural field, represented as density and feature voxel grids that can be projected to produce novel views of the scene. To further compress this representation, we train a latent auto-encoder that maps the voxel grids to a set of latent representations. A hierarchical diffusion model is then fit to the latents to complete the scene generation pipeline. We achieve a substantial improvement over existing state-of-the-art scene generation models. Additionally, we show how NeuralField-LDM can be used for a variety of 3D content creation applications, including conditional scene generation, scene inpainting and scene style manipulation.
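
The abstract does not spell out how the density and feature voxel grids are projected into a view. One common approach for neural fields is NeRF-style volume rendering, where samples along each camera ray are alpha-composited front to back. The sketch below illustrates that general technique for a single ray under that assumption; the function name `composite_along_ray` and its parameters are hypothetical and are not taken from the paper or any released code.

```python
import math


def composite_along_ray(densities, features, step):
    """Alpha-composite feature vectors sampled along one ray, front to back.

    densities: non-negative floats read from a density grid at each sample
    features:  equal-length feature vectors read from a feature grid
    step:      distance between consecutive samples along the ray
    (All three inputs are illustrative assumptions, not the paper's interface.)
    """
    dim = len(features[0])
    pixel = [0.0] * dim
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    opacity = 0.0        # accumulated weight, i.e. how opaque the ray is

    for sigma, feat in zip(densities, features):
        # Probability that the ray is absorbed within this sample's segment.
        alpha = 1.0 - math.exp(-sigma * step)
        weight = transmittance * alpha
        for k in range(dim):
            pixel[k] += weight * feat[k]
        opacity += weight
        transmittance *= 1.0 - alpha

    return pixel, opacity


# Usage: three samples along a ray, each with a 2-dimensional feature.
pixel, opacity = composite_along_ray(
    densities=[0.3, 1.2, 0.5],
    features=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    step=0.1,
)
print(pixel, opacity)
```

Repeating this for one ray per pixel yields a feature image for a given camera pose. How such an image is turned into the final view is not described in the abstract.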

Document type:	Conference paper
Faculty:	Mathematics, Computer Science and Statistics
Subjects:	000 Computer science, information science, general works > 004 Computer science
ISBN:	979-8-3503-0129-8 ; 979-8-3503-0130-4
Place of publication:	Piscataway
Language:	English
Document ID:	123897
Date deposited on Open Access LMU:	17 Feb 2025 11:27
Last modified:	17 Feb 2025 11:27
