Lipreading with long short-term memory

Lipreading, i.e., speech recognition from visual-only recordings of a speaker's face, can be achieved with a processing pipeline based solely on neural networks, yielding significantly better accuracy than conventional methods. Feedforward and recurrent neural network layers (specifically Long Short-Term Memory, LSTM) are stacked to form a single structure which is …
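
As a rough illustration of what stacking feedforward and LSTM layers into a single structure can look like, the following is a minimal forward-pass sketch in plain Python. It is not the authors' implementation. The layer sizes, the random weights, the tanh activation in the feedforward layer, the use of the final LSTM hidden state for classification, and all names (`Dense`, `LSTM`, `build_model`, `classify`) are assumptions made for this example; the abstract does not specify any of them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    # Small random weights; a trained model would learn these instead.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


class Dense:
    """Feedforward layer with tanh activation, applied to one frame at a time."""

    def __init__(self, n_in, n_out, rng):
        self.W = rand_matrix(n_out, n_in, rng)
        self.b = [0.0] * n_out

    def __call__(self, x):
        return [math.tanh(a + b) for a, b in zip(matvec(self.W, x), self.b)]


class LSTM:
    """Standard LSTM layer: input, forget, and output gates plus a candidate."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate, acting on the concatenation [x, h].
        self.W = {k: rand_matrix(n_hidden, n_in + n_hidden, rng) for k in self.GATES}
        self.b = {k: [0.0] * n_hidden for k in self.GATES}

    def step(self, x, h, c):
        z = x + h  # list concatenation of current input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
               for k in self.GATES}
        i = [sigmoid(v) for v in pre["input"]]
        f = [sigmoid(v) for v in pre["forget"]]
        o = [sigmoid(v) for v in pre["output"]]
        g = [math.tanh(v) for v in pre["candidate"]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def final_state(self, frames):
        # Run over the whole sequence and return the last hidden state.
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for x in frames:
            h, c = self.step(x, h, c)
        return h


def build_model(n_pixels, n_dense, n_hidden, n_classes, seed=0):
    rng = random.Random(seed)
    return {
        "dense": Dense(n_pixels, n_dense, rng),
        "lstm": LSTM(n_dense, n_hidden, rng),
        "out_W": rand_matrix(n_classes, n_hidden, rng),
    }


def classify(frames, model):
    # frames: list of flattened mouth-region images, one vector per video frame.
    encoded = [model["dense"](f) for f in frames]   # feedforward, per frame
    h = model["lstm"].final_state(encoded)          # recurrent, over time
    return softmax(matvec(model["out_W"], h))       # class probabilities


if __name__ == "__main__":
    rng = random.Random(1)
    model = build_model(n_pixels=12, n_dense=8, n_hidden=6, n_classes=4)
    video = [[rng.random() for _ in range(12)] for _ in range(5)]
    print(classify(video, model))
```

Because the weights here are random, the output probabilities are meaningless; the sketch only shows how raw frames can pass through a feedforward layer and then an LSTM layer without a separate hand-crafted feature extraction stage in between.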