Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark

Abstract

Knowledge-grounded dialogue systems powered by large language models often generate responses that, while fluent, are not attributable to a relevant source of information. Progress towards models that do not exhibit this issue requires evaluation metrics that can quantify its prevalence. To this end, we introduce the Benchmark for Evaluation …