Byzantine-Resilient Decentralized TD Learning with Linear Function Approximation
This paper considers the policy evaluation problem in reinforcement learning for agents connected over a decentralized, directed network. The focus is on decentralized temporal-difference (TD) learning with linear function approximation in the presence of unreliable or even malicious agents, termed Byzantine agents. In order to evaluate the quality of …