Balancing Diversity and Risk in LLM Sampling: How to Select Your Method
and Parameter for Open-Ended Text Generation
Sampling-based decoding strategies have been widely adopted for Large Language Models (LLMs) in numerous applications, targeting a balance between diversity and quality via temperature tuning and tail truncation (e.g., top-k and top-p sampling). Considering the high dynamic range of the candidate next-token distribution across different prefixes, recent studies propose to …
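As background for the temperature tuning and tail truncation mentioned above, here is a minimal sketch of temperature-scaled top-p (nucleus) sampling; this is an illustrative implementation of the standard technique, not the method proposed in this paper, and all names are chosen for the example.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_p=0.9, rng=None):
    """Temperature-scaled top-p (nucleus) sampling over one logit vector.

    Illustrative sketch of standard tail truncation, not the paper's method.
    """
    rng = rng or np.random.default_rng()
    # Temperature scaling: T < 1 sharpens the distribution, T > 1 flattens it.
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    # Numerically stable softmax.
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    # Tail truncation: keep the smallest set of tokens whose cumulative
    # probability mass reaches top_p, then renormalize over that nucleus.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1
    kept = order[:cutoff]
    kept_probs = probs[kept] / probs[kept].sum()
    return int(rng.choice(kept, p=kept_probs))
```

With a very small `top_p`, the nucleus collapses to the single most likely token; with `top_p = 1.0`, no tail is cut and sampling follows the full (temperature-scaled) distribution.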