GROD: Enhancing Generalization of Transformer with Out-of-Distribution
Detection
Transformer networks excel in natural language processing (NLP) and computer vision (CV) tasks. However, they struggle to generalize to Out-of-Distribution (OOD) datasets, that is, data whose distribution differs from that seen during training. OOD detection aims to distinguish data that deviates from the expected distribution, while maintaining optimal …