Addressing Action Oscillations through Learning Policy Inertia
Deep reinforcement learning (DRL) algorithms have been demonstrated to be effective in a wide range of challenging decision-making and control tasks. However, these methods typically suffer from severe action oscillations, particularly in discrete action settings; that is, agents select different actions in consecutive steps even though states …