Yuhta Takida

All published works
Title (Year). Authors
Transformed Low-rank Adaptation via Tensor Decomposition and Its Applications to Text-to-image Models (2025). Zerui Tao, Yuhta Takida, Naoki Murata, Qibin Zhao, Yuki Mitsufuji
TraSCE: Trajectory Steering for Concept Erasure (2024). Anubhav Jain, Yuya Kobayashi, Takashi Shibuya, Yuhta Takida, Nasir Memon, Julian Togelius, Yuki Mitsufuji
Classifier-Free Guidance inside the Attraction Basin May Cause Memorization (2024). Anubhav Jain, Yuya Kobayashi, Takashi Shibuya, Yuhta Takida, Nasir Memon, Julian Togelius, Yuki Mitsufuji
Mitigating Embedding Collapse in Diffusion Models for Categorical Data (2024). Nguyen Hoang Bac, Anna Chiara Lai, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Stefano Ermon, Yuki Mitsufuji
Distillation of Discrete Diffusion through Dimensional Correlations (2024). Satoshi Hayakawa, Yuhta Takida, Masaaki Imaizumi, Hiromi Wakaki, Yuki Mitsufuji
Jump Your Steps: Optimizing Sampling Schedule of Discrete Diffusion Models (2024). Yong-Hyun Park, Chieh-Hsin Lai, Satoshi Hayakawa, Yuhta Takida, Yuki Mitsufuji
DisMix: Disentangling Mixtures of Musical Instruments for Source-level Pitch and Timbre Manipulation (2024). Yin-Jyun Luo, Kin Wai Cheuk, Woosung Choi, Toshimitsu Uesaka, Keisuke Toyama, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Wei‐Hsiang Liao, Simon Dixon
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training (2024). Kengo Uchida, Takashi Shibuya, Yuhta Takida, Naoki Murata, Shusuke Takahashi, Yuki Mitsufuji
SoundCTM: Uniting Score-based and Consistency Models for Text-to-Sound Generation (2024). Koichi Saito, D. S. Kim, Takashi Shibuya, Chieh-Hsin Lai, Zhi Zhong, Yuhta Takida, Yuki Mitsufuji
PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher (2024). Dongjun Kim, Chieh-Hsin Lai, Wei‐Hsiang Liao, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network (2024). Takashi Shibuya, Yuhta Takida, Yuki Mitsufuji
Diffiner: A Versatile Diffusion-based Generative Refiner for Speech Enhancement (2023). Ryosuke Sawata, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji
Unsupervised Vocal Dereverberation with Diffusion-Based Generative Models (2023). Koichi Saito, Naoki Murata, Toshimitsu Uesaka, Chieh-Hsin Lai, Yuhta Takida, Takao Fukui, Yuki Mitsufuji
GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration (2023). Naoki Murata, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon
SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer (2023). Yuhta Takida, Masaaki Imaizumi, Chieh-Hsin Lai, Toshimitsu Uesaka, Naoki Murata, Yuki Mitsufuji
On the Equivalence of Consistency-Type Models: Consistency Models, Consistent Diffusion Models, and Fokker-Planck Regularization (2023). Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Naoki Murata, Yuki Mitsufuji, Stefano Ermon
Automatic Piano Transcription with Hierarchical Frequency-Time Transformer (2023). Keisuke Toyama, Taketo Akama, Yukara Ikemiya, Yuhta Takida, Wei‐Hsiang Liao, Yuki Mitsufuji
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network (2023). Takashi Shibuya, Yuhta Takida, Yuki Mitsufuji
Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion (2023). Dongjun Kim, Chieh-Hsin Lai, Wei‐Hsiang Liao, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Yutong He, Yuki Mitsufuji, Stefano Ermon
On the Language Encoder of Contrastive Cross-modal Models (2023). Mengjie Zhao, Junya Ono, Zhi Zhong, Chieh-Hsin Lai, Yuhta Takida, Naoki Murata, Wei‐Hsiang Liao, Takashi Shibuya, Hiromi Wakaki, Yuki Mitsufuji
Manifold Preserving Guided Diffusion (2023). Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dong‐Jun Kim, Wei‐Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization (2022). Yuhta Takida, Takashi Shibuya, Wei‐Hsiang Liao, Chieh-Hsin Lai, Junki Ohmura, Toshimitsu Uesaka, Naoki Murata, Shusuke Takahashi, Toshiyuki Kumakura, Yuki Mitsufuji
FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation (2022). Chieh-Hsin Lai, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon
Diffiner: A Versatile Diffusion-based Generative Refiner for Speech Enhancement (2022). Ryosuke Sawata, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji
Unsupervised vocal dereverberation with diffusion-based generative models (2022). Koichi Saito, Naoki Murata, Toshimitsu Uesaka, Chieh-Hsin Lai, Yuhta Takida, Takao Fukui, Yuki Mitsufuji
Preventing Oversmoothing in VAE via Generalized Variance Parameterization (2021). Yuhta Takida, Wei‐Hsiang Liao, Chieh-Hsin Lai, Toshimitsu Uesaka, Shusuke Takahashi, Yuki Mitsufuji
Commonly Cited References
Title (Year). Authors. Times referenced
SEGAN: Speech Enhancement Generative Adversarial Network (2017). Santiago Pascual, Antonio Bonafonte, Joan Serrà. Times referenced: 3
Denoising Diffusion Restoration Models (2022). Bahjat Kawar, Michael Elad, Stefano Ermon, Jiaming Song. Times referenced: 2
Denoising Diffusion Probabilistic Models (2020). Jonathan Ho, Ajay N. Jain, Pieter Abbeel. Times referenced: 2
Improved Techniques for Training Score-Based Generative Models (2020). Yang Song, Stefano Ermon. Times referenced: 2
Denoising Diffusion Implicit Models (2020). Jiaming Song, Chenlin Meng, Stefano Ermon. Times referenced: 2
Real-time Denoising and Dereverberation with Tiny Recurrent U-Net (2021). Hyeong-Seok Choi, Sungjin Park, Jie Hwan Lee, Hoon Heo, Dongsuk Jeon, Kyogu Lee. Times referenced: 2
Restoring Degraded Speech via a Modified Diffusion Model (2021). Jianwei Zhang, Suren Jayasuriya, Visar Berisha. Times referenced: 2
FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement (2021). Xiang Hao, Xiangdong Su, Radu Horaud, Xiaofei Li. Times referenced: 2
Diffusion Models Beat GANs on Image Synthesis (2021). Prafulla Dhariwal, Alex Nichol. Times referenced: 2
Perceptual Loss Based Speech Denoising with an Ensemble of Audio Pattern Recognition and Self-Supervised Models (2021). Saurabh Kataria, Jesús Villalba, Najim Dehak. Times referenced: 2
Improved Denoising Diffusion Probabilistic Models (2021). Alexander Quinn Nichol, Prafulla Dhariwal. Times referenced: 2
A Study on Speech Enhancement Based on Diffusion Probabilistic Model (2021). Yen-Ju Lu, Yu Tsao, Shinji Watanabe. Times referenced: 2
Improving Perceptual Quality by Phone-Fortified Perceptual Loss Using Wasserstein Distance for Speech Enhancement (2021). Tsun-An Hsieh, Cheng Yu, Szu‐Wei Fu, Xugang Lu, Yu Tsao. Times referenced: 2
Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem (2021). Jing Shi, Xuankai Chang, Tomoki Hayashi, Yen‐Ju Lu, Shinji Watanabe, Bo Xu. Times referenced: 2
Conditional Diffusion Probabilistic Model for Speech Enhancement (2022). Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao. Times referenced: 2
DNSMOS P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors (2022). Chandan K. Reddy, Vishak Gopal, Ross Cutler. Times referenced: 2
Universal Speech Enhancement with Score-based Diffusion (2022). Joan Serrà, Santiago Pascual, Jordi Pons, Recep Oğuz Araz, Davide Scaini. Times referenced: 2
Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain (2022). Simon Welker, Julius Richter, Timo Gerkmann. Times referenced: 2
VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration (2022). Haohe Liu, Xubo Liu, Qiuqiang Kong, Qiao Tian, Yan Zhao, DeLiang Wang, Chuanzeng Huang, Yuxuan Wang. Times referenced: 2
SEGAN: Speech Enhancement Generative Adversarial Network (2017). Santiago Pascual, Antonio Bonafonte, Joan Serrà. Times referenced: 2
Diffusion-Based Generative Speech Source Separation (2023). Robin Scheibler, Youna Ji, Soo-Whan Chung, Jaeuk Byun, Soyeon Choe, Min-Seok Choi. Times referenced: 2
Speech Enhancement and Dereverberation With Diffusion-Based Generative Models (2023). Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann. Times referenced: 2
DNN-Based Source Enhancement to Increase Objective Sound Quality Assessment Score (2018). Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Yoichi Haneda. Times referenced: 2
Non-intrusive Speech Quality Assessment Using Neural Networks (2019). Anderson R. Avila, Hannes Gamper, Chandan K. Reddy, Ross Cutler, Ivan Tashev, Johannes Gehrke. Times referenced: 2
MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement (2019). Szu‐Wei Fu, Chien-Feng Liao, Yu Tsao, Shou-De Lin. Times referenced: 2
Phase-aware Speech Enhancement with Deep Complex U-Net (2019). Hyeong-Seok Choi, Jang-Hyun Kim, Jaesung Huh, Adrian Kim, Jung-Woo Ha, Kyogu Lee. Times referenced: 2
Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation (2019). Yi Luo, Nima Mesgarani. Times referenced: 2
SDR – Half-baked or Well Done? (2019). Jonathan Le Roux, Scott Wisdom, Hakan Erdoğan, John R. Hershey. Times referenced: 2
Least Squares Generative Adversarial Networks (2017). Xudong Mao, Qing Li, Haoran Xie, Raymond Y.K. Lau, Zhen Wang, Stephen Paul Smolley. Times referenced: 1
Generative Adversarial Network-Based Glottal Waveform Model for Statistical Parametric Speech Synthesis (2017). Bajibabu Bollepalli, Lauri Juvela, Paavo Alku. Times referenced: 1
Solving Linear Inverse Problems Using GAN Priors: An Algorithm with Provable Guarantees (2018). Viraj Shah, Chinmay Hegde. Times referenced: 1
VocBench: A Neural Vocoder Benchmark for Speech Synthesis (2022). Ehab A. AlBadawy, Andrew Gibiansky, Qing He, Jilong Wu, Ming-Ching Chang, Siwei Lyu. Times referenced: 1
Investigating Range-Equalizing Bias in Mean Opinion Score Ratings of Synthesized Speech (2023). Erica Cooper, Junichi Yamagishi. Times referenced: 1
Reverb Conversion Of Mixed Vocal Tracks Using An End-To-End Convolutional Deep Neural Network (2021). Junghyun Koo, Seungryeol Paik, Kyogu Lee. Times referenced: 1
WaveNet: A Generative Model for Raw Audio (2016). Aäron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alexander Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu. Times referenced: 1
Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction (2022). Hyungjin Chung, Byeongsu Sim, Jong Chul Ye. Times referenced: 1
A Style-Based Generator Architecture for Generative Adversarial Networks (2019). Tero Karras, Samuli Laine, Timo Aila. Times referenced: 1
WaveGlow: A Flow-based Generative Network for Speech Synthesis (2019). Ryan Prenger, Rafael Valle, Bryan Catanzaro. Times referenced: 1
CNN architectures for large-scale audio classification (2017). Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, Robert C. Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold. Times referenced: 1
NHSS: A speech and singing parallel database (2021). Bidisha Sharma, Xiaoxue Gao, Karthika Vijayan, Xiaohai Tian, Haizhou Li. Times referenced: 1
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions (2018). Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan. Times referenced: 1
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech (2019). Heiga Zen, Viet Chau Dang, Rob Clark, Zhang Yu, Ron J. Weiss, Jia Ye, Zhifeng Chen, Yonghui Wu. Times referenced: 1
Parallel WaveGAN: A Fast Waveform Generation Model Based on Generative Adversarial Networks with Multi-Resolution Spectrogram (2020). Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim. Times referenced: 1
Improving Diffusion Models for Inverse Problems using Manifold Constraints (2022). Hyungjin Chung, Byeongsu Sim, Dohoon Ryu, Jong Chul Ye. Times referenced: 1
StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets (2022). Axel Sauer, Katja Schwarz, Andreas Geiger. Times referenced: 1
SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer (2023). Yuhta Takida, Masaaki Imaizumi, Chieh-Hsin Lai, Toshimitsu Uesaka, Naoki Murata, Yuki Mitsufuji. Times referenced: 1
Hierarchical Diffusion Models for Singing Voice Neural Vocoder (2023). Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji. Times referenced: 1
Music Enhancement via Image Translation and Vocoding (2022). Nikhil Kandpal, Oriol Nieto, Zeyu Jin. Times referenced: 1