AnyFace: Free-style Text-to-Face Synthesis and Manipulation

Type: Article

Publication Date: 2022-06-01

Citations: 31

DOI: https://doi.org/10.1109/cvpr52688.2022.01813

Abstract

Existing text-to-image synthesis methods are generally applicable only to words that appear in the training dataset. However, human faces are too variable to be described with a limited vocabulary. This paper therefore proposes AnyFace, the first free-style text-to-face method, which enables much wider open-world applications such as the metaverse, social media, cosmetics, and forensics. AnyFace has a novel two-stream framework for face image synthesis and manipulation given arbitrary descriptions of the human face. Specifically, one stream performs text-to-face generation and the other conducts face image reconstruction. Facial text and image features are extracted using the CLIP (Contrastive Language-Image Pre-training) encoders. A collaborative Cross Modal Distillation (CMD) module is designed to align the linguistic and visual features across these two streams. Furthermore, a Diverse Triplet Loss (DT loss) is developed to model fine-grained features and improve facial diversity. Extensive experiments on Multi-modal CelebA-HQ and CelebAText-HQ demonstrate significant advantages of AnyFace over state-of-the-art methods. AnyFace achieves high-quality, high-resolution, and high-diversity face synthesis and manipulation results without any constraints on the number and content of input captions.
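
The abstract mentions a triplet-style loss over text and image features but does not give its formula. As a rough illustration only, the sketch below shows a generic triplet margin loss on cosine similarity. It is not the paper's Diverse Triplet Loss; the function names, the margin value, and the toy two-dimensional vectors are all assumptions made for this example, and real CLIP features would be much higher-dimensional.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Generic triplet loss (assumed form, not the paper's DT loss).

    The loss is zero once the anchor is more similar to the positive
    than to the negative by at least `margin`.
    """
    pos_sim = cosine_similarity(anchor, positive)
    neg_sim = cosine_similarity(anchor, negative)
    return max(0.0, margin - pos_sim + neg_sim)


# Hypothetical toy embeddings standing in for text and image features.
text_feat = [1.0, 0.0]
matching_image_feat = [1.0, 0.0]
mismatched_image_feat = [0.0, 1.0]

loss_aligned = triplet_margin_loss(text_feat, matching_image_feat, mismatched_image_feat)
loss_swapped = triplet_margin_loss(text_feat, mismatched_image_feat, matching_image_feat)
```

In this toy setup the loss vanishes when the text feature is paired with its matching image feature and becomes positive when the positive and negative are swapped, which is the general behaviour a triplet objective is meant to enforce.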

Locations

  • arXiv (Cornell University)
  • 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Similar Works

  • AnyFace: Free-style Text-to-Face Synthesis and Manipulation (2022). Jianxin Sun, Qiyao Deng, Qi Li, Muyi Sun, Min Ren, Zhenan Sun
  • TediGAN: Text-Guided Diverse Face Image Generation and Manipulation (2020). Weihao Xia, Yujiu Yang, Jing-Hao Xue, Baoyuan Wu
  • Towards Open-World Text-Guided Face Image Generation and Manipulation (2021). Weihao Xia, Yujiu Yang, Jing-Hao Xue, Baoyuan Wu
  • Faces à la Carte: Text-to-Face Generation via Attribute Disentanglement (2021). Tianren Wang, Teng Zhang, Brian C. Lovell
  • TediGAN: Text-Guided Diverse Face Image Generation and Manipulation (2021). Weihao Xia, Yujiu Yang, Jing-Hao Xue, Baoyuan Wu
  • Faces à la Carte: Text-to-Face Generation via Attribute Disentanglement (2020). Tianren Wang, Teng Zhang, Brian C. Lovell
  • Multi-Attributed and Structured Text-to-Face Synthesis (2021). Rohan Wadhawan, Tanuj Drall, Shubham Singh, Shampa Chakraverty
  • Multi-Attributed and Structured Text-to-Face Synthesis (2020). Rohan Wadhawan, Tanuj Drall, Shubham Singh, Shampa Chakraverty
  • Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images (2023). Cuican Yu, Guansong Lu, Yihan Zeng, Jian Sun, Xiaodan Liang, Huibin Li, Zongben Xu, Songcen Xu, Wei Zhang, Hang Xu
  • TextCLIP: Text-Guided Face Image Generation And Manipulation Without Adversarial Training (2023). Xiaozhou You, J. Andrew Zhang
  • Text-to-Face Generation with StyleGAN2 (2022). D. M. A. Ayanthi, Sarasi Munasinghe
  • Controllable 3D Face Generation with Conditional Style Code Diffusion (2024). Shen Xiao-long, Jianxin Ma, Chang Zhou, Zongxin Yang
  • FTGAN: A Fully-trained Generative Adversarial Networks for Text to Face Generation (2019). Xiang Chen, Lingbo Qing, Xiaohai He, Xiaodong Luo, Yining Xu
  • Controllable 3D Face Generation with Conditional Style Code Diffusion (2023). Shen Xiao-long, Jianxin Ma, Chang Zhou, Zongxin Yang
  • Text2FaceGAN: Face Generation from Fine Grained Textual Descriptions (2019). Osaid Rehman Nasir, Shailesh Kumar Jha, Manraj Singh Grover, Yi Yu, Ajit Kumar, Rajiv Ratn Shah
  • Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization (2024). Jinlu Zhang, Yiyi Zhou, Qiancheng Zheng, Xiaoxiong Du, Gen Luo, Jun Peng, Xiaoshuai Sun, Rongrong Ji

Cited by (13)

  • Text-Guided Face Recognition using Multi-Granularity Cross-Modal Contrastive Learning (2024). Md Mahedi Hasan, Shoaib Meraj Sami, Nasser M. Nasrabadi
  • Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder (2023). Xinmiao Lin, Yikang Li, Jen-Hao Hsiao, Chiuman Ho, Yu Kong
  • GAN-Based Facial Attribute Manipulation (2023). Yunfan Liu, Qi Li, Qiyao Deng, Zhenan Sun, Ming-Hsuan Yang
  • High-fidelity 3D Face Generation from Natural Language Descriptions (2023). Menghua Wu, Hao Zhu, Linjia Huang, Yiyu Zhuang, Yuanxun Lu, Xun Cao
  • Multimodal Dialog Systems with Dual Knowledge-enhanced Generative Pretrained Language Model (2023). Xiaolin Chen, Xuemeng Song, Liqiang Jing, Shuo Li, Linmei Hu, Liqiang Nie
  • Pluralistic Aging Diffusion Autoencoder (2023). Peipei Li, Rui Wang, Huaibo Huang, Ran He, Zhaofeng He
  • Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images (2023). Cuican Yu, Guansong Lu, Yihan Zeng, Jian Sun, Xiaodan Liang, Huibin Li, Zongben Xu, Songcen Xu, Wei Zhang, Hang Xu
  • Text-Guided Eyeglasses Manipulation With Spatial Constraints (2023). Jiacheng Wang, Ping Liu, Jingen Liu, Wei Xu
  • Improving Face Recognition from Caption Supervision with Multi-Granular Contextual Feature Aggregation (2023). Md Mahedi Hasan, Nasser M. Nasrabadi
  • Vision + Language Applications: A Survey (2023). Yutong Zhou, Nobutaka Shimada
  • Collaborative Diffusion for Multi-Modal Face Generation and Editing (2023). Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu
  • CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector Graphics (2023). Yiren Song, Xuning Shao, Kang Chen, Weidong Zhang, Zhongliang Jing, Minzhe Li
  • Text2Performer: Text-Driven Human Video Generation (2023). Yuming Jiang, Shuai Yang, Tong Liang Koh, Wayne Wu, Chen Change Loy, Ziwei Liu

Citing (31)

  • Distilling the Knowledge in a Neural Network (2015). Geoffrey E. Hinton, Oriol Vinyals, Jeff Dean
  • Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization (2017). Xun Huang, Serge Belongie
  • Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language (2018). Seonghyeon Nam, Yunji Kim, Seon Joo Kim
  • A Style-Based Generator Architecture for Generative Adversarial Networks (2019). Tero Karras, Samuli Laine, Timo Aila
  • The Unreasonable Effectiveness of Deep Features as a Perceptual Metric (2018). Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, Oliver Wang
  • StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks (2018). Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas
  • AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks (2018). Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He
  • StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks (2017). Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas
  • Semantic Image Synthesis via Adversarial Learning (2017). Hao Dong, Simiao Yu, Chao Wu, Yike Guo
  • DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-To-Image Synthesis (2019). Minfeng Zhu, Pingbo Pan, Wei Chen, Yi Yang
  • ArcFace: Additive Angular Margin Loss for Deep Face Recognition (2019). Jiankang Deng, Jia Guo, Niannan Xue, Stefanos Zafeiriou
  • Controllable Text-to-Image Generation (2019). Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip H. S. Torr
  • Text2FaceGAN: Face Generation from Fine Grained Textual Descriptions (2019). Osaid Rehman Nasir, Shailesh Kumar Jha, Manraj Singh Grover, Yi Yu, Ajit Kumar, Rajiv Ratn Shah
  • CookGAN: Meal Image Synthesis from Ingredients (2020). Fangda Han, Ricardo Guerrero, Vladimir Pavlović
  • MaskGAN: Towards Diverse and Interactive Facial Image Manipulation (2020). Cheng-Han Lee, Ziwei Liu, Ling-Yun Wu, Ping Luo
  • Training Generative Adversarial Networks with Limited Data (2020). Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, Timo Aila
  • Reference Guided Face Component Editing (2020). Qiyao Deng, Jie Cao, Yunfan Liu, Zhenhua Chai, Qi Li, Zhenan Sun
  • ManiGAN: Text-Guided Image Manipulation (2020). Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip H. S. Torr
  • Analyzing and Improving the Image Quality of StyleGAN (2020). Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila
  • NVAE: A Deep Hierarchical Variational Autoencoder (2020). Arash Vahdat, Jan Kautz
  • DF-GAN: Deep Fusion Generative Adversarial Networks for Text-to-Image Synthesis (2020). Ming Tao, Hao Tang, Songsong Wu, Nicu Sebe, Fei Wu, Xiao-Yuan Jing
  • Towards Open-World Text-Guided Face Image Generation and Manipulation (2021). Weihao Xia, Yujiu Yang, Jing-Hao Xue, Baoyuan Wu
  • Learning Transferable Visual Models From Natural Language Supervision (2021). Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark
  • One Shot Face Swapping on Megapixels (2021). Yuhao Zhu, Qi Li, Jian Wang, Cheng-Zhong Xu, Zhenan Sun
  • TediGAN: Text-Guided Diverse Face Image Generation and Manipulation (2021). Weihao Xia, Yujiu Yang, Jing-Hao Xue, Baoyuan Wu
  • Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation (2021). Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or
  • Designing an encoder for StyleGAN image manipulation (2021). Omer Tov, Yuval Alaluf, Yotam Nitzan, Or Patashnik, Daniel Cohen-Or
  • Cycle-Consistent Inverse GAN for Text-to-Image Synthesis (2021). Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao
  • StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery (2021). Or Patashnik, Zongze Wu, Eli Shechtman, Daniel Cohen-Or, Dani Lischinski
  • GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium (2017). Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, Sepp Hochreiter
  • Conditional Image Generation and Manipulation for User-Specified Content (2020). David Stap, Maurits Bleeker, Sarah Ibrahimi, Maartje ter Hoeve