End-to-End Audiovisual Speech Recognition

Several end-to-end deep learning approaches have recently been presented that extract either audio or visual features from the input audio signals or images and perform speech recognition. However, research on end-to-end audiovisual models is very limited. In this work, we present an end-to-end audiovisual model based on residual networks and …