Dynamic Learning Rate for Deep Reinforcement Learning: A Bandit Approach

In Deep Reinforcement Learning models trained using gradient-based techniques, the choice of optimizer and its learning rate is crucial to achieving good performance: a learning rate that is too high can prevent the model from learning effectively, while one that is too low can slow convergence. Additionally, due to the non-stationarity of the objective function, the best-performing …