Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems

Do all instances need inference through the big models for a correct prediction? Perhaps not; some instances are easy and can be answered correctly by even small-capacity models. This provides opportunities for improving the computational efficiency of systems. In this work, we present an explorative study on 'model cascading', …