Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Type: Preprint

Publication Date: 2024-09-17

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2409.11406

Abstract

In 3D modeling, designers often use an existing 3D model as a reference to create new ones. This practice has inspired the development of Phidias, a novel generative model that uses diffusion for reference-augmented 3D generation. Given an image, our method leverages a retrieved or user-provided 3D reference model to guide the generation process, thereby enhancing the generation quality, generalization ability, and controllability. Our model integrates three key components: 1) meta-ControlNet that dynamically modulates the conditioning strength, 2) dynamic reference routing that mitigates misalignment between the input image and 3D reference, and 3) self-reference augmentations that enable self-supervised training with a progressive curriculum. Collectively, these designs result in a clear improvement over existing methods. Phidias establishes a unified framework for 3D generation using text, image, and 3D conditions with versatile applications.
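
To make the abstract's first component more concrete, below is a minimal PyTorch-style sketch of dynamically modulated conditioning strength, the idea the abstract attributes to meta-ControlNet: a control branch whose contribution to the diffusion backbone is scaled by a learned, input-dependent gate rather than a fixed global weight. All module names, shapes, and architectural details here are illustrative assumptions, not the paper's implementation.

# Hypothetical sketch (not the authors' code): a ControlNet-style branch whose
# conditioning strength is predicted per sample, loosely illustrating the
# dynamically modulated conditioning described in the abstract.
import torch
import torch.nn as nn

class DynamicallyGatedControlBranch(nn.Module):
    """Adds a control signal to a base feature map, scaled by a learned,
    input-dependent gate instead of a fixed global weight."""

    def __init__(self, feat_dim: int):
        super().__init__()
        # Encodes the 3D-reference condition (e.g., rendered reference views).
        self.control_encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1),
        )
        # Predicts a scalar conditioning strength in [0, 1] from the pooled
        # base and control features.
        self.gate = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim),
            nn.SiLU(),
            nn.Linear(feat_dim, 1),
            nn.Sigmoid(),
        )

    def forward(self, base_feat: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
        ctrl = self.control_encoder(reference)               # (B, C, H, W)
        pooled = torch.cat(
            [base_feat.mean(dim=(2, 3)), ctrl.mean(dim=(2, 3))], dim=-1
        )                                                    # (B, 2C)
        strength = self.gate(pooled).view(-1, 1, 1, 1)       # (B, 1, 1, 1)
        return base_feat + strength * ctrl

# Usage: gate the reference signal before it enters a diffusion U-Net block.
block = DynamicallyGatedControlBranch(feat_dim=64)
base = torch.randn(2, 64, 32, 32)        # features from the diffusion backbone
ref_views = torch.randn(2, 3, 32, 32)    # rendered views of the 3D reference
out = block(base, ref_views)
print(out.shape)                          # torch.Size([2, 64, 32, 32])

An input-dependent gate of this form would let the model suppress a retrieved reference that matches the input image poorly, which is roughly the behavior the abstract describes; how the paper actually realizes this is not specified here.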

Locations

  • arXiv (Cornell University)

Similar Works

  • Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation (2024). Xianghui Yang, Huiwen Shi, B. Zhang, Fan Yang, Jiacheng Wang, Hongxu Zhao, Xinhai Liu, Xinzhou Wang, Qingxiang Lin, Jiaao Yu
  • A Survey On Text-to-3D Contents Generation In The Wild (2024). Chenhan Jiang
  • Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation (2024). Fangfu Liu, Hanyang Wang, Weiliang Chen, Haowen Sun, Yueqi Duan
  • DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation (2024). Junkai Yan, Yipeng Gao, Qize Yang, Xihan Wei, Xuansong Xie, Ancong Wu, Wei-Shi Zheng
  • PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion (2023). Ying-Tian Liu, Guan Luo, Heyi Sun, Yin Wei, Yuan-Chen Guo, Song-Hai Zhang
  • Advances in 3D Generation: A Survey (2024). Xiaoyu Li, Qi Zhang, Di Kang, Weihao Cheng, Yiming Gao, Jingbo Zhang, Zhihao Liang, Jing Liao, Yan-Pei Cao, Ying Shan
  • A Comprehensive Survey on 3D Content Generation (2024). Jian Liu, Xiaoshui Huang, Tianyu Huang, Lu Chen, Yuenan Hou, Shixiang Tang, Ziwei Liu, Wanli Ouyang, Wangmeng Zuo, Junjun Jiang
  • Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation (2024). Seungwook Kim, Yichun Shi, Kejie Li, Minsu Cho, Peng Wang
  • MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation (2024). Zehuan Huang, Yuan-Chen Guo, Xin An, Yunhan Yang, Yangguang Li, Zi-Xin Zou, Liang Ding, Xihui Liu, Yan-Pei Cao, Lu Sheng
  • Generating Compositional Scenes via Text-to-image RGBA Instance Generation (2024). Alessandro Fontanella, Petru-Daniel Tudosiu, Yongxin Yang, Shifeng Zhang, Sarah Parisot
  • A Unified Approach for Text- and Image-guided 4D Scene Generation (2023). Yufeng Zheng, Xueting Li, Koki Nagano, Sifei Liu, Karsten Kreis, Otmar Hilliges, Shalini De Mello
  • Direct2.5: Diverse Text-to-3D Generation via Multi-view 2.5D Diffusion (2023). Yuanxun Lu, Jingyang Zhang, Shiwei Li, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan, Xun Cao, Yao Yao
  • ControlDreamer: Stylized 3D Generation with Multi-View ControlNet (2023). Yeongtak Oh, Jooyoung Choi, Yongsung Kim, Minjun Park, Chaehun Shin, Sungroh Yoon
  • MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion (2024). Dongseok Shim, Yichun Shi, Kejie Li, H. Jin Kim, Peng Wang
  • Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation (2024). Yuanbo Yang, Jiahao Shao, Xinyang Li, Yujun Shen, Andreas C. Geiger, Yiyi Liao
  • MVDream: Multi-view Diffusion for 3D Generation (2023). Yichun Shi, Peng Wang, Jianglong Ye, Long Mai, Kejie Li, Xiao Yang
  • Learning Continuous 3D Words for Text-to-Image Generation (2024). Ta-Ying Cheng, Matheus Gadelha, Thibault Groueix, Matthew Fisher, Radomír Měch, Andrew Markham, Niki Trigoni
  • Any-to-3D Generation via Hybrid Diffusion Supervision (2024). Yijun Fan, Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, Rongrong Ji
  • DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data (2024). Qihao Liu, Yi Zhang, Song Bai, Adam Kortylewski, Alan Yuille
  • EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior (2023). Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Zhipeng Hu, Changjie Fan, Yu Xin

Cited by (0)

Citing (0)