
This paper proposes an end-to-end approach for single-channel speaker-independent multi-speaker speech separation, where time-frequency (T-F) masking, the short-time Fourier transform (STFT), and its inverse are represented as layers within a deep network. Previous approaches, rather than computing a loss on the reconstructed signal, used a surrogate loss based on the target …
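
The pipeline the abstract describes (STFT, T-F masking, inverse STFT, then a loss on the reconstructed signal) can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the paper's implementation: the square-root Hann window, the naive DFT, the function names (`stft`, `istft`, `separate`, `waveform_l1`), and the L1 waveform loss are all choices made here for illustration, and the mask is supplied by hand rather than predicted by a network.

```python
import cmath
import math


def sqrt_hann(n):
    # Periodic square-root Hann window (assumed here), used for analysis and synthesis.
    return [math.sqrt(0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i in range(n)]


def dft(frame, inverse=False):
    # Naive O(n^2) discrete Fourier transform; adequate for a small demonstration.
    n = len(frame)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(frame[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def stft(signal, frame_len, hop):
    # Window each frame and transform it; returns a list of complex spectra.
    win = sqrt_hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append(dft([signal[start + i] * win[i] for i in range(frame_len)]))
    return frames


def istft(spec, frame_len, hop, length):
    # Inverse-transform each frame, apply the synthesis window, and overlap-add.
    # Dividing by the accumulated squared window undoes the windowing.
    win = sqrt_hann(frame_len)
    out = [0.0] * length
    norm = [0.0] * length
    for m, frame in enumerate(spec):
        time_frame = dft(frame, inverse=True)
        for i in range(frame_len):
            out[m * hop + i] += time_frame[i].real * win[i]
            norm[m * hop + i] += win[i] ** 2
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]


def separate(mixture, mask, frame_len, hop):
    # T-F masking: scale each complex mixture bin by a real-valued mask entry,
    # which keeps the mixture phase, then resynthesize the waveform.
    spec = stft(mixture, frame_len, hop)
    masked = [[m * x for m, x in zip(mrow, xrow)] for mrow, xrow in zip(mask, spec)]
    return istft(masked, frame_len, hop, len(mixture))


def waveform_l1(estimate, target):
    # Loss computed on the reconstructed signal rather than on the mask or spectrum.
    return sum(abs(e - t) for e, t in zip(estimate, target)) / len(target)
```

Calling `separate(mixture, mask, 16, 8)` with a mask of all ones should return the mixture unchanged up to floating-point error, which is a convenient sanity check. In an end-to-end setup the same three steps would be written with differentiable operations so that the gradient of `waveform_l1` can flow back through the inverse STFT into the network that produces the mask.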
