Rethinking Evaluation in ASR: Are Our Models Robust Enough?
Is pushing numbers on a single benchmark valuable in automatic speech recognition? Research results in acoustic modeling are typically evaluated based on performance on a single dataset. While the research community has coalesced around various benchmarks, we set out to understand generalization performance in acoustic modeling across datasets, in particular, if models …