Deep convolutional acoustic word embeddings using word-pair side information
Recent studies have revisited whole words, rather than phonetic units, as the basic modelling unit in speech recognition and query applications. Such whole-word segmental systems rely on a function that maps a variable-length speech segment to a vector in a fixed-dimensional space; the resulting acoustic word embeddings need to …
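To make the idea of mapping a variable-length segment to a fixed-dimensional vector concrete, here is a minimal sketch of one very simple such function: uniform downsampling of frames followed by concatenation. This is an illustrative baseline only, not the convolutional approach named in the title; the function name `downsample_embed`, the parameter `n_samples`, and the toy 3-dimensional frames are all assumptions introduced for this example.

```python
from typing import List


def downsample_embed(segment: List[List[float]], n_samples: int = 10) -> List[float]:
    """Map a variable-length segment to a fixed-length vector.

    `segment` is a list of frames, each frame a list of `feat_dim` floats.
    The result always has length n_samples * feat_dim, whatever the
    number of frames in the input.
    """
    num_frames = len(segment)
    if num_frames == 0:
        raise ValueError("segment must contain at least one frame")
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2")

    # Choose n_samples frame indices spread evenly from first to last frame.
    # For segments shorter than n_samples, some frames are simply repeated.
    indices = [
        round(i * (num_frames - 1) / (n_samples - 1)) for i in range(n_samples)
    ]

    # Concatenate the selected frames into one flat vector.
    return [value for i in indices for value in segment[i]]


# Two toy "speech segments" of different durations (35 and 82 frames),
# each frame having 3 features (hypothetical values, not real acoustics).
short_segment = [[float(t)] * 3 for t in range(35)]
long_segment = [[float(t)] * 3 for t in range(82)]

emb_short = downsample_embed(short_segment)
emb_long = downsample_embed(long_segment)

# Both embeddings live in the same 30-dimensional space.
print(len(emb_short), len(emb_long))
```

Because both outputs have the same dimensionality, segments of different durations can be compared directly with an ordinary vector distance, which is the property a whole-word segmental system relies on. A learned embedding function would replace `downsample_embed` while keeping the same input and output contract.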