Ask a Question

Prefer a chat interface with context about you and your work?

MDPGT: Momentum-Based Decentralized Policy Gradient Tracking

MDPGT: Momentum-Based Decentralized Policy Gradient Tracking

We propose a novel policy gradient method for multi-agent reinforcement learning, which leverages two different variance-reduction techniques and does not require large batches over iterations. Specifically, we propose a momentum-based decentralized policy gradient tracking (MDPGT) where a new momentum-based variance reduction technique is used to approximate the local policy gradient …