Towards More Robust Interpretation via Local Gradient Alignment

Neural network interpretation methods, particularly feature attribution methods, are known to be fragile with respect to adversarial input perturbations. To address this, several methods that enhance the local smoothness of the gradient during training have been proposed for attaining robust feature attributions. However, the lack of considering the normalization of …
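As a rough illustration of the underlying idea (not the paper's actual method), "local gradient alignment" can be quantified as the cosine similarity between input gradients at nearby points: for a smooth model, the gradient-based attribution should barely change under a small input perturbation. The sketch below uses a toy analytically differentiable model; the model, its weights, and the perturbation scale are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))  # toy model weights (illustrative)

def grad_f(x):
    # Analytic input gradient of a toy model f(x) = sum(tanh(W @ x)):
    # df/dx = W.T @ (1 - tanh(W @ x)**2)
    return W.T @ (1.0 - np.tanh(W @ x) ** 2)

def cosine_alignment(g1, g2, eps=1e-12):
    # Cosine similarity between two gradient (attribution) vectors.
    return float(g1 @ g2 / (np.linalg.norm(g1) * np.linalg.norm(g2) + eps))

x = rng.standard_normal(8)
delta = 0.01 * rng.standard_normal(8)  # small input perturbation

# Alignment close to 1 means the attribution is locally stable.
alignment = cosine_alignment(grad_f(x), grad_f(x + delta))
```

A fragile attribution method would show low alignment even for tiny perturbations; training-time smoothness regularizers aim to keep this quantity high.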