Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review

While reasoning capabilities typically emerge in large language models (LLMs) with tens of billions of parameters, recent research focuses on improving smaller open-source models through knowledge distillation (KD) from commercial LLMs. However, many of these studies rely solely on responses from a single LLM as the gold rationale, unlike the …