Ask a Question

Prefer a chat interface with context about you and your work?

On the Reliability of Test Collections for Evaluating Systems of Different Types

On the Reliability of Test Collections for Evaluating Systems of Different Types

As deep learning based models are increasingly being used for information retrieval, a major challenge is to ensure the availability of test collections for measuring their quality. Test collections are usually generated based on pooling results of various retrieval systems, but until recently this did not include deep learning systems. …