Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network

The deep Q-network (DQN) and return-based reinforcement learning are two promising algorithms proposed in recent years. DQN brings advances to complex sequential decision problems, while return-based algorithms have advantages in making use of sample trajectories. In this paper, we propose a general framework to combine DQN and most of the …