When Waiting Is Not an Option: Learning Options With a Deliberation Cost
When Waiting Is Not an Option: Learning Options With a Deliberation Cost
Recent work has shown that temporally extended actions (options) can be learned fully end-to-end as opposed to being specified in advance. While the problem of how to learn options is increasingly well understood, the question of what good options should be has remained elusive. We formulate our answer to what …