TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Single-document news summarization has seen substantial progress on faithfulness in recent years, driven by research on evaluating factual consistency, that is, on detecting hallucinations. We ask whether these advances carry over to other text summarization domains. We propose a new evaluation benchmark on topic-focused dialogue summarization, generated by LLMs of …