# L3DG: Latent 3D Gaussian Diffusion

We propose L3DG, the first approach for generative 3D modeling of 3D Gaussians through a latent 3D Gaussian diffusion formulation. This enables effective generative 3D modeling, scaling to the generation of entire room-scale scenes that can be rendered very efficiently. To enable effective synthesis of 3D Gaussians, we propose a latent diffusion formulation that operates in a compressed latent space of 3D Gaussians. This compressed latent space is learned by a vector-quantized variational autoencoder (VQ-VAE), for which we employ a sparse convolutional architecture to operate efficiently on room-scale scenes. This way, the complexity of the costly diffusion-based generation process is substantially reduced, allowing for higher detail in object-level generation as well as scalability to large scenes. By leveraging the 3D Gaussian representation, the generated scenes can be rendered from arbitrary viewpoints in real time. We demonstrate that our approach significantly improves visual quality over prior work on unconditional object-level radiance field synthesis, and we showcase its applicability to room-scale scene generation.
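The two-stage pipeline described in the abstract can be summarized in a short sketch: a VQ-VAE compresses a grid of 3D Gaussian parameters into a discrete latent grid, and a diffusion model is trained in that latent space. The following is a minimal conceptual sketch, not the authors' implementation: dense `nn.Conv3d` layers stand in for the paper's sparse convolutional architecture, and the 14-channel per-voxel Gaussian parameterization, network sizes, codebook size, and noise schedule are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed per-voxel Gaussian parameterization: offset(3) + scale(3)
# + rotation quaternion(4) + opacity(1) + color(3) = 14 channels.
GAUSS_DIM = 14
LATENT_DIM = 8

class GaussianVQVAE(nn.Module):
    """Compresses a voxelized grid of 3D Gaussian parameters into a discrete
    latent grid. Dense convolutions stand in for sparse convolutions here."""
    def __init__(self, codebook_size=512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(GAUSS_DIM, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv3d(32, LATENT_DIM, 4, 2, 1))          # 32^3 grid -> 8^3 latent
        self.codebook = nn.Embedding(codebook_size, LATENT_DIM)
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(LATENT_DIM, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose3d(32, GAUSS_DIM, 4, 2, 1))  # back to 32^3 grid

    def quantize(self, z):
        # Snap each latent voxel to its nearest codebook entry.
        b, c, d, h, w = z.shape
        flat = z.permute(0, 2, 3, 4, 1).reshape(-1, c)
        idx = torch.cdist(flat, self.codebook.weight).argmin(1)
        zq = self.codebook(idx).view(b, d, h, w, c).permute(0, 4, 1, 2, 3)
        return z + (zq - z).detach()  # straight-through gradient estimator

    def forward(self, x):
        z = self.encoder(x)
        zq = self.quantize(z)
        return self.decoder(zq), z, zq

# Tiny noise-prediction network for diffusion on latent grids.
# (A real denoiser would also condition on the timestep t; omitted for brevity.)
eps_model = nn.Sequential(
    nn.Conv3d(LATENT_DIM, 32, 3, padding=1), nn.ReLU(),
    nn.Conv3d(32, LATENT_DIM, 3, padding=1))

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # assumed linear schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def diffusion_loss(z0):
    """One DDPM-style training step operating on VQ-VAE latents z0."""
    t = torch.randint(0, T, (z0.shape[0],))
    a = alpha_bar[t].view(-1, 1, 1, 1, 1)
    noise = torch.randn_like(z0)
    zt = a.sqrt() * z0 + (1 - a).sqrt() * noise  # forward noising of the latent
    return F.mse_loss(eps_model(zt), noise)

# Usage: encode a batch of 32^3 voxel grids, train diffusion on the latents.
vqvae = GaussianVQVAE()
scene = torch.randn(2, GAUSS_DIM, 32, 32, 32)
recon, z, zq = vqvae(scene)
loss = diffusion_loss(zq.detach())
loss.backward()
```

At sampling time, the reverse diffusion process would iteratively denoise a random latent grid, and the VQ-VAE decoder would map the result back to 3D Gaussians, which can then be rasterized from arbitrary viewpoints in real time. Working in the compressed latent grid rather than on raw Gaussians is what keeps the diffusion step tractable at room scale.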
