ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models

Feature attribution methods (FAs), such as gradients and attention, are widely used to estimate the importance of each input feature to a model's predictions. Existing work in natural language processing has mostly focused on developing and testing FAs for encoder-only language models (LMs) in classification tasks. However, it is …