Learning from Committee: Reasoning Distillation from a Mixture of
Teachers with Peer-Review
While reasoning capabilities typically emerge only in large language models (LLMs) with tens of billions of parameters, recent research has focused on improving smaller open-source models through knowledge distillation (KD) from commercial LLMs. However, many of these studies rely solely on responses from a single LLM as the gold rationale, unlike the …