Hierarchical Reinforcement Learning for Open-Domain Dialog

Open-domain dialog generation is a challenging problem: maximum likelihood training can lead to repetitive outputs, models have difficulty tracking long-term conversational goals, and training on standard movie or online datasets may lead to the generation of inappropriate, biased, or offensive text. Reinforcement Learning (RL) is a powerful framework that could …