Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition

Type: Preprint

Publication Date: 2017-08-16

Citations: 73

DOI: https://doi.org/10.21437/interspeech.2017-556

Abstract

Layer normalization is a recently introduced technique for normalizing the activities of neurons in deep neural networks to improve the training speed and stability. In this paper, we introduce a new layer normalization technique called Dynamic Layer Normalization (DLN) for adaptive neural acoustic modeling in speech recognition. By dynamically generating the scaling and shifting parameters in layer normalization, DLN adapts neural acoustic models to the acoustic variability arising from various factors such as speakers, channel noises, and environments. Unlike other adaptive acoustic models, our proposed approach does not require additional adaptation data or speaker information such as i-vectors. Moreover, the model size is fixed as it dynamically generates adaptation parameters. We apply our proposed DLN to deep bidirectional LSTM acoustic models and evaluate them on two benchmark datasets for large vocabulary ASR experiments: WSJ and TED-LIUM release 2. The experimental results show that our DLN improves neural acoustic models in terms of transcription accuracy by dynamically adapting to various speakers and environments.
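
The idea of generating the scaling and shifting parameters dynamically can be illustrated with a small NumPy sketch. This is not the authors' implementation: the summary vector (a mean over time of a tanh projection of the input frames), the weight names (W_sum, W_gain, W_bias, b_gain, b_bias), and all dimensions are assumptions made for illustration, and random arrays stand in for acoustic features and LSTM activations.

```python
import numpy as np


def layer_norm(h, gain, bias, eps=1e-5):
    """Normalize each row of h over its feature axis, then scale and shift."""
    mean = h.mean(axis=-1, keepdims=True)
    std = h.std(axis=-1, keepdims=True)
    return gain * (h - mean) / (std + eps) + bias


def dynamic_layer_norm(h, summary, W_gain, b_gain, W_bias, b_bias):
    """Layer norm whose gain and bias are generated from a summary vector.

    All weight names are hypothetical and chosen for illustration only.
    """
    gain = summary @ W_gain + b_gain  # shape: (n_hidden,)
    bias = summary @ W_bias + b_bias  # shape: (n_hidden,)
    return layer_norm(h, gain, bias)


rng = np.random.default_rng(0)
T, n_feat, n_hidden, n_summary = 50, 40, 8, 4

frames = rng.normal(size=(T, n_feat))  # stand-in acoustic features
h = rng.normal(size=(T, n_hidden))     # stand-in hidden activations

# Assumed summary: mean over time of a tanh projection of the input frames.
W_sum = 0.1 * rng.normal(size=(n_feat, n_summary))
summary = np.tanh(frames @ W_sum).mean(axis=0)

W_gain = 0.1 * rng.normal(size=(n_summary, n_hidden))
W_bias = 0.1 * rng.normal(size=(n_summary, n_hidden))
b_gain, b_bias = np.ones(n_hidden), np.zeros(n_hidden)

out = dynamic_layer_norm(h, summary, W_gain, b_gain, W_bias, b_bias)
print(out.shape)  # (50, 8)
```

In this sketch the number of trainable weights depends only on the chosen dimensions, not on how many speakers or environments are seen, which is one way to read the abstract's statement that the model size is fixed. The gain and bias are recomputed from each utterance's own summary, so no separate adaptation data or i-vectors enter the computation.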

Locations

  • arXiv (Cornell University) - View - PDF
  • Interspeech 2017 - View

Similar Works

  • Dynamic Layer Normalization for Adaptive Neural Acoustic Modeling in Speech Recognition (2017). Taesup Kim, Inchul Song, Yoshua Bengio
  • Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation (2016). Paweł Świętojański, Jinyu Li, Steve Renals
  • SNDCNN: Self-normalizing deep CNNs with scaled exponential linear units for speech recognition (2019). Zhen Huang, Tim Ng, Leo Liu, H. Benjamin Mason, Xiaodan Zhuang, Daben Liu
  • SNDCNN: Self-Normalizing Deep CNNs with Scaled Exponential Linear Units for Speech Recognition (2020). Zhen Huang, Tim Ng, Leo Liu, H. Benjamin Mason, Xiaodan Zhuang, Daben Liu
  • Attentive batch normalization for lstm-based acoustic modeling of speech recognition (2020). Fenglin Ding, Wu Guo, Li-Rong Dai, Jun Du
  • Recent Progresses in Deep Learning based Acoustic Models (Updated) (2018). Dong Yu, Jinyu Li
  • Exploring Gaussian mixture model framework for speaker adaptation of deep neural network acoustic models (2020). Natalia Tomashenko, Yuri Khokhlov, Yannick Estève
  • Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems (2018). Hieu-Thi Luong, Junichi Yamagishi
  • Scaling and Bias Codes for Modeling Speaker-Adaptive DNN-Based Speech Synthesis Systems (2018). Hieu-Thi Luong, Junichi Yamagishi
  • Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing (2022). Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata
  • Empirical Evaluation of Speaker Adaptation on DNN Based Acoustic Model (2018). Ke Wang, Junbo Zhang, Yujun Wang, Lei Xie
  • A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures (2018). Jan Vaněk, Josef Michálek, Jan Zelinka, Josef Psutka
  • A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures (2018). Jan Vaněk, Josef Michálek, Ján Zelinka, Josef Psutka
  • Increasing Deep Neural Network Acoustic Model Size for Large Vocabulary Continuous Speech Recognition (2014). Andrew L. Maas, Awni Hannun, Christopher T. Lengerich, Peng Qi, Daniel Jurafsky, Andrew Y. Ng
  • Maximum a posteriori adaptation of network parameters in deep models (2015). Zhen Huang, Sabato Marco Siniscalchi, I‐Ming Chen, Jinyu Li, Jiadong Wu, Chin‐Hui Lee
  • Maximum a Posteriori Adaptation of Network Parameters in Deep Models (2015). Zhen Huang, Sabato Marco Siniscalchi, I‐Ming Chen, Jiadong Wu, Chin‐Hui Lee
  • Analyzing Deep CNN-Based Utterance Embeddings for Acoustic Model Adaptation (2018). Joanna Rownicka, Peter Bell, Steve Renals
  • Analyzing deep CNN-based utterance embeddings for acoustic model adaptation (2018). Joanna Rownicka, Peter Bell, Steve Renals
  • Analyzing deep CNN-based utterance embeddings for acoustic model adaptation (2018). J. Równicka, Peter Bell, Steve Renals