AnyFace: Free-style Text-to-Face Synthesis and Manipulation

Jianxin Sun, Qiyao Deng, Qi Li, Muyi Sun, Min Ren, Zhenan Sun

Type: Article

Publication Date: 2022-06-01

Citations: 31

DOI: https://doi.org/10.1109/cvpr52688.2022.01813

Abstract

Existing text-to-image synthesis methods are generally applicable only to words that appear in the training dataset. However, human faces are too variable to be described with a limited vocabulary. This paper therefore proposes the first free-style text-to-face method, AnyFace, which enables much wider open-world applications such as the metaverse, social media, cosmetics, and forensics. AnyFace has a novel two-stream framework for face image synthesis and manipulation given arbitrary descriptions of the human face. Specifically, one stream performs text-to-face generation and the other conducts face image reconstruction. Facial text and image features are extracted using the CLIP (Contrastive Language-Image Pre-training) encoders, and a collaborative Cross Modal Distillation (CMD) module is designed to align the linguistic and visual features across the two streams. Furthermore, a Diverse Triplet Loss (DT loss) is developed to model fine-grained features and improve facial diversity. Extensive experiments on Multi-modal CelebA-HQ and CelebAText-HQ demonstrate significant advantages of AnyFace over state-of-the-art methods. AnyFace achieves high-quality, high-resolution, and high-diversity face synthesis and manipulation results without any constraints on the number and content of input captions.
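
The abstract names two ingredients, feature alignment across the text and image streams and a triplet-style loss, without giving their formulas. The sketch below is only a generic illustration of those two ideas on toy vectors, not the paper's CMD module or DT loss. The function names, the cosine-distance formulation, and the margin value are all assumptions made for illustration, and the short lists stand in for CLIP text and image features.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def alignment_loss(text_feat, image_feat):
    """Hypothetical alignment term: 1 - cosine similarity.

    It is zero when the text and image features point in the same direction.
    """
    return 1.0 - cosine_similarity(text_feat, image_feat)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss on cosine distance.

    This is the textbook form, used here as a stand-in; the paper's
    Diverse Triplet Loss is not specified in the abstract.
    """
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy stand-ins for encoder outputs (assumed values, not real CLIP features).
text_feat = [1.0, 0.0, 0.0]
matching_image = [2.0, 0.0, 0.0]   # same direction as the text feature
other_image = [0.0, 1.0, 0.0]      # orthogonal to the text feature

print(alignment_loss(text_feat, matching_image))
print(triplet_loss(text_feat, matching_image, other_image))
```

In this toy setup the loss vanishes when the matching image is already closer to the text feature than the non-matching one by at least the margin, and it becomes positive when the roles of the two images are swapped.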

Locations

arXiv (Cornell University)
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Similar Works

Title	Year	Authors
AnyFace: Free-style Text-to-Face Synthesis and Manipulation	2022	Jianxin Sun Qiyao Deng Qi Li Muyi Sun Min Ren Zhenan Sun
TediGAN: Text-Guided Diverse Face Image Generation and Manipulation	2020	Weihao Xia Yujiu Yang Jing‐Hao Xue Baoyuan Wu
Towards Open-World Text-Guided Face Image Generation and Manipulation	2021	Weihao Xia Yujiu Yang Jing‐Hao Xue Baoyuan Wu
Faces à la Carte: Text-to-Face Generation via Attribute Disentanglement	2021	Tianren Wang Teng Zhang Brian C. Lovell
TediGAN: Text-Guided Diverse Face Image Generation and Manipulation	2021	Weihao Xia Yujiu Yang Jing‐Hao Xue Baoyuan Wu
Faces à la Carte: Text-to-Face Generation via Attribute Disentanglement	2020	Tianren Wang Teng Zhang Brian C. Lovell
Multi-Attributed and Structured Text-to-Face Synthesis	2021	Rohan Wadhawan Tanuj Drall Shubham Singh Shampa Chakraverty
Multi-Attributed and Structured Text-to-Face Synthesis	2020	Rohan Wadhawan Tanuj Drall Shubham Singh Shampa Chakraverty
Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images	2023	Cuican Yu Guansong Lu Yihan Zeng Jian Sun Xiaodan Liang Huibin Li Zongben Xu Songcen Xu Wei Zhang Hang Xu
TextCLIP: Text-Guided Face Image Generation And Manipulation Without Adversarial Training	2023	Xiaozhou You J. Andrew Zhang
Text-to-Face Generation with StyleGAN2	2022	D. M. A. Ayanthi Sarasi Munasinghe
Controllable 3D Face Generation with Conditional Style Code Diffusion	2024	Shen Xiao-long Jianxin Ma Chang Zhou Zongxin Yang
FTGAN: A Fully-trained Generative Adversarial Networks for Text to Face Generation	2019	Xiang Chen Lingbo Qing Xiaohai He Xiaodong Luo Yining Xu
Controllable 3D Face Generation with Conditional Style Code Diffusion	2023	Shen Xiao-long Jianxin Ma Chang Zhou Zongxin Yang
Text2FaceGAN: Face Generation from Fine Grained Textual Descriptions	2019	Osaid Rehman Nasir Shailesh Kumar Jha Manraj Singh Grover Yi Yu Ajit Kumar Rajiv Ratn Shah
Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization	2024	Jinlu Zhang Yiyi Zhou Qiancheng Zheng Xiaoxiong Du Gen Luo Jun Peng Xiaoshuai Sun Rongrong Ji

Cited by (13)

Title	Year	Authors
Text-Guided Face Recognition using Multi-Granularity Cross-Modal Contrastive Learning	2024	Md Mahedi Hasan Shoaib Meraj Sami Nasser M. Nasrabadi
Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder	2023	Xinmiao Lin Yikang Li Jen-Hao Hsiao Chiuman Ho Yu Kong
GAN-Based Facial Attribute Manipulation	2023	Yunfan Liu Qi Li Qiyao Deng Zhenan Sun Ming-Hsuan Yang
High-fidelity 3D Face Generation from Natural Language Descriptions	2023	Menghua Wu Hao Zhu Linjia Huang Yiyu Zhuang Yuanxun Lu Xun Cao
Multimodal Dialog Systems with Dual Knowledge-enhanced Generative Pretrained Language Model	2023	Xiaolin Chen Xuemeng Song Liqiang Jing Shuo Li Linmei Hu Liqiang Nie
Pluralistic Aging Diffusion Autoencoder	2023	Peipei Li Rui Wang Huaibo Huang Ran He Zhaofeng He
Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images	2023	Cuican Yu Guansong Lu Yihan Zeng Jian Sun Xiaodan Liang Huibin Li Zongben Xu Songcen Xu Wei Zhang Hang Xu
Text-Guided Eyeglasses Manipulation With Spatial Constraints	2023	Jiacheng Wang Ping Liu Jingen Liu Wei Xu
Improving Face Recognition from Caption Supervision with Multi-Granular Contextual Feature Aggregation	2023	Md Mahedi Hasan Nasser M. Nasrabadi
Vision + Language Applications: A Survey	2023	Yutong Zhou Nobutaka Shimada
Collaborative Diffusion for Multi-Modal Face Generation and Editing	2023	Ziqi Huang Kelvin C. K. Chan Yuming Jiang Ziwei Liu
CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector Graphics	2023	Yiren Song Xuning Shao Kang Chen Weidong Zhang Zhongliang Jing Minzhe Li
Text2Performer: Text-Driven Human Video Generation	2023	Yuming Jiang Shuai Yang Tong Liang Koh Wayne Wu Chen Change Loy Ziwei Liu

References (31)

Title	Year	Authors
Distilling the Knowledge in a Neural Network	2015	Geoffrey E. Hinton Oriol Vinyals Jeff Dean
Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization	2017	Xun Huang Serge Belongie
Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language	2018	Seonghyeon Nam Yunji Kim Seon Joo Kim
A Style-Based Generator Architecture for Generative Adversarial Networks	2019	Tero Karras Samuli Laine Timo Aila
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric	2018	Richard Zhang Phillip Isola Alexei A. Efros Eli Shechtman Oliver Wang
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks	2018	Han Zhang Tao Xu Hongsheng Li Shaoting Zhang Xiaogang Wang Xiaolei Huang Dimitris Metaxas
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks	2018	Tao Xu Pengchuan Zhang Qiuyuan Huang Han Zhang Zhe Gan Xiaolei Huang Xiaodong He
StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks	2017	Han Zhang Tao Xu Hongsheng Li Shaoting Zhang Xiaogang Wang Xiaolei Huang Dimitris Metaxas
Semantic Image Synthesis via Adversarial Learning	2017	Hao Dong Simiao Yu Chao Wu Yike Guo
DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-To-Image Synthesis	2019	Minfeng Zhu Pingbo Pan Wei Chen Yi Yang
ArcFace: Additive Angular Margin Loss for Deep Face Recognition	2019	Jiankang Deng Jia Guo Niannan Xue Stefanos Zafeiriou
Controllable Text-to-Image Generation	2019	Bowen Li Xiaojuan Qi Thomas Lukasiewicz Philip H. S. Torr
Text2FaceGAN: Face Generation from Fine Grained Textual Descriptions	2019	Osaid Rehman Nasir Shailesh Kumar Jha Manraj Singh Grover Yi Yu Ajit Kumar Rajiv Ratn Shah
CookGAN: Meal Image Synthesis from Ingredients	2020	Fangda Han Ricardo Guerrero Vladimir Pavlović
MaskGAN: Towards Diverse and Interactive Facial Image Manipulation	2020	Cheng‐Han Lee Ziwei Liu Ling‐Yun Wu Ping Luo
Training Generative Adversarial Networks with Limited Data	2020	Tero Karras Miika Aittala Janne Hellsten Samuli Laine Jaakko Lehtinen Timo Aila
Reference Guided Face Component Editing	2020	Qiyao Deng Jie Cao Yunfan Liu Zhenhua Chai Qi Li Zhenan Sun
ManiGAN: Text-Guided Image Manipulation	2020	Bowen Li Xiaojuan Qi Thomas Lukasiewicz Philip H. S. Torr
Analyzing and Improving the Image Quality of StyleGAN	2020	Tero Karras Samuli Laine Miika Aittala Janne Hellsten Jaakko Lehtinen Timo Aila
NVAE: A Deep Hierarchical Variational Autoencoder	2020	Arash Vahdat Jan Kautz
DF-GAN: Deep Fusion Generative Adversarial Networks for Text-to-Image Synthesis	2020	Ming Tao Hao Tang Songsong Wu Nicu Sebe Fei Wu Xiao‐Yuan Jing
Towards Open-World Text-Guided Face Image Generation and Manipulation	2021	Weihao Xia Yujiu Yang Jing‐Hao Xue Baoyuan Wu
Learning Transferable Visual Models From Natural Language Supervision	2021	Alec Radford Jong Wook Kim Chris Hallacy Aditya Ramesh Gabriel Goh Sandhini Agarwal Girish Sastry Amanda Askell Pamela Mishkin Jack Clark
One Shot Face Swapping on Megapixels	2021	Yuhao Zhu Qi Li Jian Wang Cheng‐Zhong Xu Zhenan Sun
TediGAN: Text-Guided Diverse Face Image Generation and Manipulation	2021	Weihao Xia Yujiu Yang Jing‐Hao Xue Baoyuan Wu
Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation	2021	Elad Richardson Yuval Alaluf Or Patashnik Yotam Nitzan Yaniv Azar Stav Shapiro Daniel Cohen‐Or
Designing an encoder for StyleGAN image manipulation	2021	Omer Tov Yuval Alaluf Yotam Nitzan Or Patashnik Daniel Cohen‐Or
Cycle-Consistent Inverse GAN for Text-to-Image Synthesis	2021	Hao Wang Guosheng Lin Steven C. H. Hoi Chunyan Miao
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery	2021	Or Patashnik Zongze Wu Eli Shechtman Daniel Cohen‐Or Dani Lischinski
GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium	2017	Martin Heusel Hubert Ramsauer Thomas Unterthiner Bernhard Nessler Sepp Hochreiter
Conditional Image Generation and Manipulation for User-Specified Content	2020	David Stap Maurits Bleeker Sarah Ibrahimi Maartje ter Hoeve