Shusuke Takahashi

All published works
Action Title Year Authors
+ SAVGBench: Benchmarking Spatially Aligned Audio-Video Generation 2024 Kazuki Shimada
Christian Simon
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
+ SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond 2024 Marco Comunità
Zhi Zhong
Akira Takahashi
Shiqi Yang
Mengjie Zhao
Koichi Saito
Yukara Ikemiya
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
+ MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training 2024 Kengo Uchida
Takashi Shibuya
Yuhta Takida
Naoki Murata
Shusuke Takahashi
Yuki Mitsufuji
+ Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation 2024 Shiqi Yang
Zhi Zhong
Mengjie Zhao
Shusuke Takahashi
Masato Ishii
Takashi Shibuya
Yuki Mitsufuji
+ Zero- and Few-Shot Sound Event Localization and Detection 2024 Kazuki Shimada
Kengo Uchida
Yuichiro Koyama
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
Tatsuya Kawahara
+ Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders 2024 Hao Shi
Kazuki Shimada
Masato Hirano
Takashi Shibuya
Yuichiro Koyama
Zhi Zhong
Shusuke Takahashi
Tatsuya Kawahara
Yuki Mitsufuji
+ Extending Audio Masked Autoencoders toward Audio Restoration 2023 Zhi Zhong
Hao Shi
Masato Hirano
Kazuki Shimada
Kazuya Tateishi
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
+ Diffiner: A Versatile Diffusion-based Generative Refiner for Speech Enhancement 2023 Ryosuke Sawata
Naoki Murata
Yuhta Takida
Toshimitsu Uesaka
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
+ An Attention-Based Approach to Hierarchical Multi-Label Music Instrument Classification 2023 Zhi Zhong
Masato Hirano
Kazuki Shimada
Kazuya Tateishi
Shusuke Takahashi
Yuki Mitsufuji
+ DiffRoll: Diffusion-Based Generative Music Transcription with Unsupervised Pretraining Capability 2023 Kin Wai Cheuk
Ryosuke Sawata
Toshimitsu Uesaka
Naoki Murata
Naoya Takahashi
Shusuke Takahashi
Dorien Herremans
Yuki Mitsufuji
+ An Attention-based Approach to Hierarchical Multi-label Music Instrument Classification 2023 Zhi Zhong
Masato Hirano
Kazuki Shimada
Kazuya Tateishi
Shusuke Takahashi
Yuki Mitsufuji
+ Diffusion-based Signal Refiner for Speech Separation 2023 Masato Hirano
Kazuki Shimada
Yuichiro Koyama
Shusuke Takahashi
Yuki Mitsufuji
+ Extending Audio Masked Autoencoders Toward Audio Restoration 2023 Zhi Zhong
Hao Shi
Masato Hirano
Kazuki Shimada
Kazuya Tateishi
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
+ The Whole Is Greater than the Sum of Its Parts: Improving DNN-based Music Source Separation 2023 Ryosuke Sawata
Naoya Takahashi
Stefan Uhlich
Shusuke Takahashi
Yuki Mitsufuji
+ Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders 2023 Hao Shi
Kazuki Shimada
Masato Hirano
Takashi Shibuya
Yuichiro Koyama
Zhi Zhong
Shusuke Takahashi
Tatsuya Kawahara
Yuki Mitsufuji
+ STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events 2023 Kazuki Shimada
Archontis Politis
Parthasaarathy Sudarsanam
Daniel Krause
Kengo Uchida
Sharath Adavanne
Aapo Hakala
Yuichiro Koyama
Naoya Takahashi
Shusuke Takahashi
+ The Sound Demixing Challenge 2023 – Cinematic Demixing Track 2023 Stefan Uhlich
Giorgio Fabbro
Masato Hirano
Shusuke Takahashi
Gordon Wichern
Jonathan Le Roux
Dipam Chakraborty
Sharada P. Mohanty
Kai Li
Yi Luo
+ Zero- and Few-shot Sound Event Localization and Detection 2023 Kazuki Shimada
Kengo Uchida
Yuichiro Koyama
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
Tatsuya Kawahara
+ Improving Character Error Rate is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-Box Acoustic Models 2022 Ryosuke Sawata
Yosuke Kashiwagi
Shusuke Takahashi
+ Spatial Mixup: Directional Loudness Modification as Data Augmentation for Sound Event Localization and Detection 2022 Ricardo Falcón Pérez
Kazuki Shimada
Yuichiro Koyama
Shusuke Takahashi
Yuki Mitsufuji
+ Music Source Separation With Deep Equilibrium Models 2022 Yuichiro Koyama
Naoki Murata
Stefan Uhlich
Giorgio Fabbro
Shusuke Takahashi
Yuki Mitsufuji
+ Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training 2022 Kazuki Shimada
Yuichiro Koyama
Shusuke Takahashi
Naoya Takahashi
Emiru Tsunoo
Yuki Mitsufuji
+ Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection 2022 Yuichiro Koyama
Kazuhide Shigemi
Masafumi Takahashi
Kazuki Shimada
Naoya Takahashi
Emiru Tsunoo
Shusuke Takahashi
Yuki Mitsufuji
+ Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training 2022 Kazuki Shimada
Yuichiro Koyama
Shusuke Takahashi
Naoya Takahashi
Emiru Tsunoo
Yuki Mitsufuji
+ SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization 2022 Yuhta Takida
Takashi Shibuya
Wei-Hsiang Liao
Chieh-Hsin Lai
Junki Ohmura
Toshimitsu Uesaka
Naoki Murata
Shusuke Takahashi
Toshiyuki Kumakura
Yuki Mitsufuji
+ STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events 2022 Archontis Politis
Kazuki Shimada
Parthasaarathy Sudarsanam
Sharath Adavanne
Daniel Krause
Yuichiro Koyama
Naoya Takahashi
Shusuke Takahashi
Yuki Mitsufuji
Tuomas Virtanen
+ DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability 2022 Kin Wai Cheuk
Ryosuke Sawata
Toshimitsu Uesaka
Naoki Murata
Naoya Takahashi
Shusuke Takahashi
Dorien Herremans
Yuki Mitsufuji
+ Diffiner: A Versatile Diffusion-based Generative Refiner for Speech Enhancement 2022 Ryosuke Sawata
Naoki Murata
Yuhta Takida
Toshimitsu Uesaka
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
+ Manifold-Aware Deep Clustering: Maximizing Angles Between Embedding Vectors Based on Regular Simplex 2021 Keitaro Tanaka
Ryosuke Sawata
Shusuke Takahashi
+ Manifold-Aware Deep Clustering: Maximizing Angles between Embedding Vectors Based on Regular Simplex 2021 Keitaro Tanaka
Ryosuke Sawata
Shusuke Takahashi
+ All For One And One For All: Improving Music Separation By Bridging Networks 2021 Ryosuke Sawata
Stefan Uhlich
Shusuke Takahashi
Yuki Mitsufuji
+ ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization And Detection 2021 Kazuki Shimada
Yuichiro Koyama
Naoya Takahashi
Shusuke Takahashi
Yuki Mitsufuji
+ Manifold-Aware Deep Clustering: Maximizing Angles between Embedding Vectors Based on Regular Simplex 2021 Keitaro Tanaka
Ryosuke Sawata
Shusuke Takahashi
+ Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection 2021 Kazuki Shimada
Naoya Takahashi
Yuichiro Koyama
Shusuke Takahashi
Emiru Tsunoo
Masafumi Takahashi
Yuki Mitsufuji
+ Preventing Oversmoothing in VAE via Generalized Variance Parameterization 2021 Yuhta Takida
Wei‐Hsiang Liao
Chieh-Hsin Lai
Toshimitsu Uesaka
Shusuke Takahashi
Yuki Mitsufuji
+ Improving Character Error Rate Is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-box Acoustic Models 2021 Ryosuke Sawata
Yosuke Kashiwagi
Shusuke Takahashi
+ Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training 2021 Kazuki Shimada
Yuichiro Koyama
Shusuke Takahashi
Naoya Takahashi
Emiru Tsunoo
Yuki Mitsufuji
+ Music Source Separation with Deep Equilibrium Models 2021 Yuichiro Koyama
Naoki Murata
Stefan Uhlich
Giorgio Fabbro
Shusuke Takahashi
Yuki Mitsufuji
+ Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection 2021 Yuichiro Koyama
Kazuhide Shigemi
Masafumi Takahashi
Kazuki Shimada
Naoya Takahashi
Emiru Tsunoo
Shusuke Takahashi
Yuki Mitsufuji
+ Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3Net 2020 Kazuki Shimada
Naoya Takahashi
Shusuke Takahashi
Yuki Mitsufuji
+ ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection 2020 Kazuki Shimada
Yuichiro Koyama
Naoya Takahashi
Shusuke Takahashi
Yuki Mitsufuji
+ All for One and One for All: Improving Music Separation by Bridging Networks 2020 Ryosuke Sawata
Stefan Uhlich
Shusuke Takahashi
Yuki Mitsufuji
Commonly Cited References
Action Title Year Authors # of times referenced
+ SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition 2019 Daniel Park
William Chan
Yu Zhang
Chung‐Cheng Chiu
Barret Zoph
Ekin D. Cubuk
Quoc V. Le
10
+ AENet: Learning Deep Audio Features for Video Analysis 2017 Naoya Takahashi
Michael Gygli
Luc Van Gool
9
+ First Order Ambisonics Domain Spatial Augmentation for DNN-based Direction of Arrival Estimation 2019 Luca Mazzon
Yuma Koizumi
Masahiro Yasuda
Noboru Harada
8
+ Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks 2018 Sharath Adavanne
Archontis Politis
Joonas Nikunen
Tuomas Virtanen
7
+ Polyphonic Sound Event Detection and Localization using a Two-Stage Strategy 2019 Yin Cao
Qiuqiang Kong
Turab Iqbal
Fengyan An
Wenwu Wang
Mark D. Plumbley
7
+ A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection. 2020 Archontis Politis
Sharath Adavanne
Tuomas Virtanen
6
+ Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation 2019 Yi Luo
Nima Mesgarani
6
+ An Improved Event-Independent Network for Polyphonic Sound Event Localization and Detection 2021 Yin Cao
Turab Iqbal
Qiuqiang Kong
Fengyan An
Wenwu Wang
Mark D. Plumbley
5
+ Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019 2020 Archontis Politis
Annamaria Mesaros
Sharath Adavanne
Toni Heittola
Tuomas Virtanen
5
+ A Sequence Matching Network for Polyphonic Sound Event Localization and Detection 2020 Thi Ngoc Tho Nguyen
Douglas L. Jones
Woon‐Seng Gan
5
+ ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization And Detection 2021 Kazuki Shimada
Yuichiro Koyama
Naoya Takahashi
Shusuke Takahashi
Yuki Mitsufuji
5
+ SDR – Half-baked or Well Done? 2019 Jonathan Le Roux
Scott Wisdom
Hakan Erdoğan
John R. Hershey
5
+ Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour 2017 Priya Goyal
Piotr Dollár
Ross Girshick
Pieter Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
4
+ Phase-aware Speech Enhancement with Deep Complex U-Net 2019 Hyeong-Seok Choi
Jang-Hyun Kim
Jaesung Huh
Adrian Kim
Jung-Woo Ha
Kyogu Lee
4
+ A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection 2020 Archontis Politis
Sharath Adavanne
Tuomas Virtanen
4
+ Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks 2017 Morten Kolbæk
Dong Yu
Zheng‐Hua Tan
Jesper Jensen
4
+ A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection 2021 Qing Wang
Jun Du
Huaxin Wu
Jia Pan
Feng Ma
Chin‐Hui Lee
4
+ Speech Enhancement and Dereverberation With Diffusion-Based Generative Models 2023 Julius Richter
Simon Welker
Jean-Marie Lemercier
Bunlong Lay
Timo Gerkmann
4
+ Universal Speech Enhancement with Score-based Diffusion 2022 Joan Serrà
Santiago Pascual
Jordi Pons
Recep Oğuz Araz
Davide Scaini
4
+ A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection 2021 Archontis Politis
Sharath Adavanne
Daniel Krause
Antoine Deleforge
Prerak Srivastava
Tuomas Virtanen
4
+ Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection 2021 Kazuki Shimada
Naoya Takahashi
Yuichiro Koyama
Shusuke Takahashi
Emiru Tsunoo
Masafumi Takahashi
Yuki Mitsufuji
4
+ A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection. 2021 Archontis Politis
Sharath Adavanne
Daniel Krause
Antoine Deleforge
Prerak Srivastava
Tuomas Virtanen
4
+ SEGAN: Speech Enhancement Generative Adversarial Network 2017 Santiago Pascual
Antonio Bonafonte
Joan Serrà
4
+ Densely connected multidilated convolutional networks for dense prediction tasks 2021 Naoya Takahashi
Yuki Mitsufuji
4
+ FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement 2021 Xiang Hao
Xiangdong Su
Radu Horaud
Xiaofei Li
3
+ Real-time Denoising and Dereverberation with Tiny Recurrent U-Net 2021 Hyeong-Seok Choi
Sungjin Park
Jie Hwan Lee
Hoon Heo
Dongsuk Jeon
Kyogu Lee
3
+ D3Net: Densely connected multidilated DenseNet for music source separation 2020 Naoya Takahashi
Yuki Mitsufuji
3
+ Denoising Diffusion Implicit Models 2020 Jiaming Song
Chenlin Meng
Stefano Ermon
3
+ Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Detection 2016 Naoya Takahashi
Michael Gygli
Beat Pfister
Luc Van Gool
3
+ DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection 2021 Thi Ngoc Tho Nguyen
Karn N. Watcharasupat
Ngoc Khanh Nguyen
Douglas L. Jones
Woon‐Seng Gan
3
+ Denoising Diffusion Probabilistic Models 2020 Jonathan Ho
Ajay N. Jain
Pieter Abbeel
3
+ MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement 2019 Szu‐Wei Fu
Chien-Feng Liao
Yu Tsao
Shou-De Lin
3
+ Asteroid: The PyTorch-Based Audio Source Separation Toolkit for Researchers 2020 Manuel Pariente
Samuele Cornell
Joris Cosentino
Sunit Sivasankaran
Efthymios Tzinis
Jens Heitkaemper
Michel Olvera
Fabian-Robert Stöter
Mathieu Hu
Juan M. Martín-Doñas
3
+ Emergency Vehicles Audio Detection and Localization in Autonomous Driving 2021 Hongyi Sun
Xinyi Liu
Kecheng Xu
Jinghao Miao
Qi Luo
3
+ Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation 2020 Yi Luo
Zhuo Chen
Takuya Yoshioka
3
+ Conditional Diffusion Probabilistic Model for Speech Enhancement 2022 Yen-Ju Lu
Zhong-Qiu Wang
Shinji Watanabe
Alexander Richard
Cheng Yu
Yu Tsao
3
+ Denoising Diffusion Restoration Models 2022 Bahjat Kawar
Michael Elad
Stefano Ermon
Jiaming Song
3
+ Deep attractor network for single-microphone speaker separation 2017 Zhuo Chen
Yi Luo
Nima Mesgarani
2
+ Restoring Degraded Speech via a Modified Diffusion Model 2021 Jianwei Zhang
Suren Jayasuriya
Visar Berisha
2
+ DNN-Based Source Enhancement to Increase Objective Sound Quality Assessment Score 2018 Yuma Koizumi
Kenta Niwa
Yusuke Hioka
Kazunori Kobayashi
Yoichi Haneda
2
+ Deep clustering and conventional networks for music separation: Stronger together 2017 Yi Luo
Zhuo Chen
John R. Hershey
Jonathan Le Roux
Nima Mesgarani
2
+ Improved Techniques for Training Score-Based Generative Models 2020 Yang Song
Stefano Ermon
2
+ PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition 2020 Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
2
+ Permutation invariant training of deep models for speaker-independent multi-talker speech separation 2017 Dong Yu
Morten Kolbæk
Zheng‐Hua Tan
Jesper Jensen
2
+ Pyroomacoustics: A Python Package for Audio Room Simulation and Array Processing Algorithms 2018 Robin Scheibler
Eric Bezzam
Ivan Dokmanić
2
+ Event-Independent Network for Polyphonic Sound Event Localization and Detection 2020 Yin Cao
Turab Iqbal
Qiuqiang Kong
Yue Zhong
Wenwu Wang
Mark D. Plumbley
2
+ Diffusion Models Beat GANs on Image Synthesis 2021 Prafulla Dhariwal
Alex Nichol
2
+ Speaker-Independent Speech Separation With Deep Attractor Network 2018 Yi Luo
Zhuo Chen
Nima Mesgarani
2
+ Single-Channel Multi-Speaker Separation Using Deep Clustering 2016 Yusuf Ziya Işık
Jonathan Le Roux
Zhuo Chen
Shinji Watanabe
John R. Hershey
2
+ Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3Net 2020 Kazuki Shimada
Naoya Takahashi
Shusuke Takahashi
Yuki Mitsufuji
2