DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer

¹University of Oxford, ²Australian National University, ³University of Copenhagen

DreamBeast generates fantastical 3D animal assets composed of distinct parts.

TL;DR

  • We introduce DreamBeast, a new method for efficiently generating part-aware 3D assets.
  • DreamBeast uses a novel part-aware knowledge transfer mechanism.
  • It efficiently extracts part-level knowledge from Stable Diffusion 3 into a Part-Affinity NeRF, so Part-Affinity maps can be rendered instantly from arbitrary camera views.
  • The rendered Part-Affinity maps modulate a multi-view diffusion model during score distillation sampling (SDS).
  • This improves the part-awareness and quality of generated 3D creatures while keeping computational cost low.

3D Fantastical Animal Generation

Abstract

We present DreamBeast, a novel method based on score distillation sampling (SDS) for generating fantastical 3D animal assets composed of distinct parts.

Existing SDS methods often struggle with this generation task due to a limited understanding of part-level semantics in text-to-image diffusion models. While recent diffusion models, such as Stable Diffusion 3, demonstrate a better part-level understanding, they are prohibitively slow and exhibit other common problems associated with single-view diffusion models. DreamBeast overcomes this limitation through a novel part-aware knowledge transfer mechanism. For each generated asset, we efficiently extract part-level knowledge from the Stable Diffusion 3 model into a 3D part-affinity implicit representation. This enables us to instantly generate part-affinity maps from arbitrary camera views, which we then use to modulate the guidance of a multi-view diffusion model during SDS to generate 3D assets of fantastical animals.

DreamBeast significantly enhances the quality of generated 3D creatures with user-specified part compositions while reducing computational overhead, as demonstrated by extensive quantitative and qualitative evaluations.
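
For reference, the SDS objective that the abstract builds on is usually written in the form below. This is the standard formulation from the original score distillation work, not DreamBeast's modified guidance, which this page does not spell out. The symbols are assumptions introduced here for illustration: θ are the NeRF parameters, x = g(θ, c) is the image rendered from camera c, x_t is the noised rendering at timestep t, ε_φ is the diffusion model's noise prediction for text prompt y, and w(t) is a timestep weighting.

```latex
\nabla_{\theta} \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\,\epsilon}\!\left[
      w(t)\,\bigl(\epsilon_{\phi}(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad x = g(\theta, c)
```

In DreamBeast, the noise prediction comes from a multi-view diffusion model whose attention is modulated by the rendered part-affinity maps, so the part-level guidance enters through ε_φ.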

Pipeline

    1. Partially optimize a NeRF using standard SDS.
    2. Render multiple views of the partially optimized NeRF and feed them into Stable Diffusion 3 (SD3) with a text prompt to extract Part-Affinity maps from its cross-attention.
    3. Train a Part-Affinity NeRF using these extracted maps.
    4. Freeze the optimized Part-Affinity NeRF and render both the 3D asset NeRF and the Part-Affinity NeRF from the same camera pose. Use the rendered Part-Affinity map to modulate cross- and self-attention in MVDream, generating a part-aware 3D animal.
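
Steps 2 and 4 above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the averaging over heads and tokens, the per-pixel normalization, and the additive `strength` bias are all assumptions made here for illustration, and only cross-attention is shown. The sketch assumes cross-attention probabilities of shape `(heads, pixels, tokens)` and a hypothetical mapping from each part name to the prompt token indices that describe it.

```python
import numpy as np


def part_affinity_from_cross_attention(attn, part_token_ids):
    """Turn cross-attention into per-part affinity maps (illustrative only).

    attn: array of shape (heads, pixels, tokens) with attention probabilities.
    part_token_ids: dict mapping a part name to its prompt token indices
        (a hypothetical input format, not taken from the paper).
    Returns a dict mapping each part name to a (pixels,) array; for every
    pixel the affinities across parts sum to roughly 1.
    """
    mean_attn = attn.mean(axis=0)  # average over heads -> (pixels, tokens)
    names = list(part_token_ids)
    # Average the attention over the tokens that belong to each part.
    raw = np.stack(
        [mean_attn[:, part_token_ids[n]].mean(axis=1) for n in names], axis=1
    )  # (pixels, parts)
    # Normalize across parts so each pixel holds a distribution over parts.
    normed = raw / (raw.sum(axis=1, keepdims=True) + 1e-8)
    return {n: normed[:, i] for i, n in enumerate(names)}


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def modulate_cross_attention(logits, affinity, part_token_ids, strength=2.0):
    """Bias pre-softmax cross-attention logits with part-affinity maps.

    logits: array of shape (pixels, tokens) with pre-softmax scores.
    affinity: output of part_affinity_from_cross_attention.
    strength: assumed scalar controlling how strongly the maps steer attention.
    Returns attention probabilities of shape (pixels, tokens).
    """
    biased = logits.astype(float).copy()
    for name, token_ids in part_token_ids.items():
        # Raise the scores of a part's tokens where that part's affinity is high.
        biased[:, token_ids] += strength * affinity[name][:, None]
    return softmax(biased, axis=-1)
```

In the pipeline, the affinity maps would be rendered from the frozen Part-Affinity NeRF at the current camera pose rather than recomputed from SD3 at every SDS step, which is where the efficiency gain described above comes from.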

Part-Affinity Map

Results

Comparison with baseline methods.

Part-Affinity maps visualization.

Non-animal Part-aware Generation

BibTeX

@misc{li2024dreambeastdistilling3dfantastical,
      title={DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer}, 
      author={Runjia Li and Junlin Han and Luke Melas-Kyriazi and Chunyi Sun and Zhaochong An and Zhongrui Gui and Shuyang Sun and Philip Torr and Tomas Jakab},
      year={2024},
      eprint={2409.08271},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2409.08271}, 
}