DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

ICLR 2024
Jingxiang Sun 1,   Bo Zhang 3,   Ruizhi Shao 1,   Lizhen Wang 1,   Wen Liu 2,   Zhenda Xie 2,   Yebin Liu 1

1 Tsinghua University,   2 DeepSeek AI,   3 Independent Researcher

Abstract

We present DreamCraft3D, a hierarchical 3D content generation method that produces high-fidelity and coherent 3D objects. We tackle the problem by leveraging a 2D reference image to guide the stages of geometry sculpting and texture boosting. A central focus of this work is to address the consistency issue that existing works encounter. To sculpt geometries that render coherently, we perform score distillation sampling via a view-dependent diffusion model. This 3D prior, alongside several training strategies, prioritizes geometric consistency but compromises texture fidelity. We further propose Bootstrapped Score Distillation (BSD) to specifically boost the texture. We train a personalized diffusion model, DreamBooth, on the augmented renderings of the scene, imbuing it with 3D knowledge of the scene being optimized. The score distillation from this 3D-aware diffusion prior provides view-consistent guidance for the scene. Notably, through an alternating optimization of the diffusion prior and the 3D scene representation, we achieve mutually reinforcing improvements: the optimized 3D scene aids in training the scene-specific diffusion model, which in turn offers increasingly view-consistent guidance for 3D optimization. The optimization is thus bootstrapped and leads to substantial texture boosting. With tailored 3D priors throughout the hierarchical generation, DreamCraft3D generates coherent 3D objects with photorealistic renderings, advancing the state-of-the-art in 3D content generation.

Generated Textured Meshes

Humoristic san goku body mixed with wild boar head running, amazing high tech fitness room digital illustration

3D CGI Pixar Lionel Messi artfully kicking paint-filled bottles

Portrait painting of batman with black leather armor, ultra realistic, concept art

A blue jay standing on a large basket of rainbow macarons

A DSLR photo of a corgi wearing a beret and holding a baguette, standing up on two hind legs

Isometric view of a MINI cute hyperrealistic futuristic soldier cat wearing cyberpunk jacket. orange skin.

Method Overview

DreamCraft3D leverages a 2D image generated from the text prompt and uses it to guide the stages of geometry sculpting and texture boosting. When sculpting the geometry, the view-conditioned diffusion model provides crucial 3D guidance to ensure geometric consistency. We then specifically improve the texture quality through a cyclic optimization: we augment the multi-view renderings and use them to fine-tune a diffusion model with DreamBooth, which then provides multi-view consistent gradients for optimizing the scene. We term the loss that distills from this evolving diffusion prior Bootstrapped Score Distillation (BSD).
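
The cyclic optimization described above can be illustrated with a toy numerical sketch. This is not the authors' implementation, and every name in it (`augment`, `finetune_prior`, `distillation_grad`, `bootstrap`) is hypothetical. It assumes an identity "renderer" (each row of `scene` is the image seen from one view), Gaussian noise as the augmentation, and a degenerate "diffusion prior" that always predicts a single mean image as the clean sample. Only the structure is meant to carry over: fit the prior to augmented renderings, then update the scene with a score-distillation-style gradient from that prior, and repeat.

```python
import numpy as np

def augment(renders, noise_std, rng):
    # Toy augmentation (assumption): perturb each rendering with Gaussian noise.
    return renders + noise_std * rng.standard_normal(renders.shape)

def finetune_prior(prior_mean, augmented, lr=0.5):
    # Toy stand-in for fine-tuning: move the prior's mean image toward
    # the average of the augmented multi-view renderings.
    return prior_mean + lr * (augmented.mean(axis=0) - prior_mean)

def distillation_grad(renders, prior_mean, alpha_bar, rng):
    # Score-distillation-style gradient (eps_pred - eps). The toy denoiser
    # always predicts `prior_mean` as the clean image, so this reduces to
    # sqrt(alpha_bar / (1 - alpha_bar)) * (renders - prior_mean).
    eps = rng.standard_normal(renders.shape)
    x_t = np.sqrt(alpha_bar) * renders + np.sqrt(1.0 - alpha_bar) * eps
    eps_pred = (x_t - np.sqrt(alpha_bar) * prior_mean) / np.sqrt(1.0 - alpha_bar)
    return eps_pred - eps

def bootstrap(scene, reference, rng, rounds=5, steps=20, step_size=0.1):
    # Alternate between (1) refitting the prior on augmented renderings and
    # (2) optimizing the scene with guidance distilled from that prior.
    prior_mean = reference.copy()
    for _ in range(rounds):
        augmented = augment(scene, noise_std=0.05, rng=rng)
        prior_mean = finetune_prior(prior_mean, augmented)
        for _ in range(steps):
            grad = distillation_grad(scene, prior_mean, alpha_bar=0.5, rng=rng)
            # Identity renderer, so the gradient applies to the scene directly.
            scene = scene - step_size * grad
    return scene, prior_mean

rng = np.random.default_rng(0)
reference = rng.random(16)                                    # toy reference image
scene = reference + 0.3 * rng.standard_normal((8, 16))        # 8 inconsistent views

final_scene, prior_mean = bootstrap(scene, reference, rng)

# Cross-view disagreement before and after the alternating optimization.
spread_before = scene.std(axis=0).mean()
spread_after = final_scene.std(axis=0).mean()
print(spread_before, spread_after)
```

In this toy setting the views are pulled toward a shared target, so the cross-view spread shrinks from round to round. In the actual method the prior is a diffusion model fine-tuned on renderings of a 3D scene representation, and the gradient is backpropagated through a differentiable renderer rather than applied to the images directly.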

BibTeX

@article{sun2023dreamcraft3d,
  title={Dreamcraft3d: Hierarchical 3d generation with bootstrapped diffusion prior},
  author={Sun, Jingxiang and Zhang, Bo and Shao, Ruizhi and Wang, Lizhen and Liu, Wen and Xie, Zhenda and Liu, Yebin},
  journal={arXiv preprint arXiv:2310.16818},
  year={2023}
}