Byzantine-Resilient Decentralized Policy Evaluation With Linear Function Approximation
This paper considers the policy evaluation problem in a multi-agent reinforcement learning (MARL) environment over decentralized and directed networks. The focus is on decentralized temporal difference (TD) learning with linear function approximation in the presence of unreliable or even malicious agents, termed Byzantine agents. In order to evaluate the …