vMFER: Von Mises-Fisher Experience Resampling Based on Uncertainty of
Gradient Directions for Policy Improvement
vMFER: Von Mises-Fisher Experience Resampling Based on Uncertainty of
Gradient Directions for Policy Improvement
Reinforcement Learning (RL) is a widely employed technique in decision-making problems, encompassing two fundamental operations -- policy evaluation and policy improvement. Enhancing learning efficiency remains a key challenge in RL, with many efforts focused on using ensemble critics to boost policy evaluation efficiency. However, when using multiple critics, the actor …