High-Fidelity Audio Compression with Improved RVQGAN

Type: Preprint

Publication Date: 2023-01-01

Citations: 14

DOI: https://doi.org/10.48550/arxiv.2306.06546

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

Action Title Year Authors
+ High Fidelity Neural Audio Compression 2022 Alexandre Défossez
Jade Copet
Gabriel Synnaeve
Yossi Adi
+ Deep Neural Networks and End-to-End Learning for Audio Compression 2021 Daniela N. Rim
Inseon Jang
Heeyoul Choi
+ SoundStream: An End-to-End Neural Audio Codec 2021 Neil Zeghidour
Alejandro Luebs
Ahmed Omran
Jan Skoglund
Marco Tagliasacchi
+ Gull: A Generative Multifunctional Audio Codec 2024 Yi Luo
Jianwei Yu
Hangting Chen
Rongzhi Gu
Chao Weng
+ Unified Signal Compression Using a GAN with Iterative Latent Representation Optimization 2021 Bowen Liu
Chang-Woo Lee
Ang Cao
Hun-Seok Kim
+ WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling 2024 Shengpeng Ji
Ziyue Karen Jiang
Xize Cheng
Y.W. Chen
Minghui Fang
Jialong Zuo
Qian Yang
Ruiqi Li
Ziang Zhang
Xiaoda Yang
+ SoundStream: An End-to-End Neural Audio Codec 2021 Neil Zeghidour
Alejandro Luebs
Ahmed S. Omran
Jan Skoglund
Marco Tagliasacchi
+ Unified Signal Compression Using Generative Adversarial Networks 2020 Bowen Liu
Ang Cao
Hun-Seok Kim
+ Unified Signal Compression Using Generative Adversarial Networks 2019 Bowen Liu
Ang Cao
Hun-Seok Kim
+ MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios 2024 Xiao-Hang Jiang
Yang Ai
Rui-Chen Zheng
Hui-Peng Du
Ye-Xin Lu
Zhen-Hua Ling
+ Fre-GAN: Adversarial Frequency-consistent Audio Synthesis 2021 Ji-Hoon Kim
Sanghoon Lee
Ji-Hyun Lee
Seong-Whan Lee
+ Fre-GAN: Adversarial Frequency-Consistent Audio Synthesis 2021 Ji-Hoon Kim
Sanghoon Lee
Ji-Hyun Lee
Seong-Whan Lee
+ NDVQ: Robust Neural Audio Codec with Normal Distribution-Based Vector Quantization 2024 Zhikang Niu
Sanyuan Chen
Long Zhou
Ziyang Ma
Chen Xie
Shujie Liu
+ Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference 2024 Edresson Casanova
Ryan Langman
Paarth Neekhara
Shehzeen Hussain
Jason Li
Subhankar Ghosh
Ante Jukić
Sang-gil Lee
+ Low Bitrate High-Quality RVQGAN-based Discrete Speech Tokenizer 2024 Slava Shechtman
Avihu Dekel
+ Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models 2024 Dan Jacobellis
Daniel Cummings
Neeraja J. Yadwadkar
+ Learned Compression for Compressed Learning 2024 Dan Jacobellis
Neeraja J. Yadwadkar
+ HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec 2023 Dongchao Yang
Songxiang Liu
Rongjie Huang
Jinchuan Tian
Chao Weng
Yuexian Zou

Works That Cite This (11)

Action Title Year Authors
+ STEMGEN: A Music Generation Model That Listens 2024 Julian D. Parker
Janne Spijkervet
Katerina Kosta
Furkan Yesiler
Boris Kuznetsov
Ju-Chiang Wang
Matt Avent
Jitong Chen
Duc Dung Le
+ Fewer-Token Neural Speech Codec with Time-Invariant Codes 2024 Yong Ren
Tao Wang
Jiangyan Yi
Le Xu
Jianhua Tao
Chu Yuan Zhang
Junzuo Zhou
+ Generative De-Quantization for Neural Speech Codec Via Latent Diffusion 2024 Haici Yang
Inseon Jang
Minje Kim
+ Personalized Neural Speech Codec 2024 Inseon Jang
Haici Yang
Wootaek Lim
Seungkwon Beack
Minje Kim
+ Adapting Frechet Audio Distance for Generative Music Evaluation 2024 Azalea Gui
Hannes A. Gamper
Sebastian Braun
Dimitra Emmanouilidou
+ FunCodec: A Fundamental, Reproducible and Integrable Open-Source Toolkit for Neural Speech Codec 2024 Zhihao Du
Shiliang Zhang
Kai Hu
Siqi Zheng
+ Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS 2024 Yifan Yang
Feiyu Shen
Chenpeng Du
Ziyang Ma
Kai Yu
Daniel Povey
Xie Chen
+ V2Meow: Meowing to the Visual Beat via Video-to-Music Generation 2024 Kun Su
Judith Yue Li
Qingqing Huang
Dima Kuzmin
Joonseok Lee
Chris Donahue
Fei Sha
Aren Jansen
Yu Wang
M. Verzetti
+ Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition 2024 Krishna C. Puvvada
Nithin Rao Koluguri
Kunal Dhawan
Jagadeesh Balam
Boris Ginsburg
+ Codec-Superb @ SLT 2024: A Lightweight Benchmark For Neural Audio Codec Models 2024 Haibin Wu
Xuanjun Chen
Yi-Cheng Lin
Kai-Wei Chang
Jiawei Du
Ke-Han Lu
Alexander H. Liu
Ho-Lam Chung
Yuan-Kuei Wu
Dongchao Yang

Works Cited by This (0)

Action Title Year Authors