Towards Non-Saturating Recurrent Units for Modelling Long-Term Dependencies

Modelling long-term dependencies is a challenge for recurrent neural networks, primarily because gradients vanish during training as the sequence length increases. Gradients can be attenuated by transition operators and are attenuated or dropped by activation functions. Canonical architectures like LSTM alleviate this issue by …
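
To make the attenuation effect concrete, below is a minimal sketch (not taken from the paper) that pushes a unit gradient backward through a fixed number of recurrent steps and compares a saturating activation (tanh, whose derivative goes to zero for large pre-activations) with a non-saturating one (identity). The sequence length `T`, hidden size `n`, transition matrix `W`, and the random stand-in pre-activations are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 200, 32                                        # assumed sequence length and hidden size
W = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))   # stand-in recurrent transition operator

def backprop_gradient_norm(act_deriv):
    """Norm of a unit gradient pushed backward through T recurrent steps."""
    grad = np.ones(n)
    for _ in range(T):
        pre_activation = rng.normal(scale=2.0, size=n)     # stand-in pre-activations
        # One step of the chain rule for h_t = f(W h_{t-1} + ...):
        # dL/dh_{t-1} = W^T (dL/dh_t * f'(pre_activation))
        grad = W.T @ (grad * act_deriv(pre_activation))
    return np.linalg.norm(grad)

tanh_deriv = lambda x: 1.0 - np.tanh(x) ** 2      # saturating: derivative -> 0 for large |x|
identity_deriv = lambda x: np.ones_like(x)        # non-saturating: derivative stays at 1

print("tanh    :", backprop_gradient_norm(tanh_deriv))
print("identity:", backprop_gradient_norm(identity_deriv))
```

Under these assumptions, the gradient norm through the saturating activation shrinks rapidly with `T`, while the non-saturating case is governed only by the transition operator `W`, which is the motivation the abstract points to.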