Learning the Target Network in Function Space

We focus on the task of learning the value function in the reinforcement learning (RL) setting. This task is often solved by updating a pair of online and target networks while ensuring that the parameters of these two networks are equivalent. We propose Lookahead-Replicate (LR), a new value-function approximation algorithm …
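For orientation, the conventional online/target scheme mentioned above can be sketched as follows. This is a minimal illustration of the standard baseline, not of Lookahead-Replicate: it assumes a linear value function, TD(0) updates, and a periodic hard copy of the online weights into the target weights. The function name `td_with_target_network`, the hyperparameters, and the two-state chain are all hypothetical choices made for this example.

```python
import numpy as np

def td_with_target_network(transitions, n_features, gamma=0.9, lr=0.1, sync_every=10):
    """Linear TD(0) with a separate target weight vector (illustrative sketch)."""
    online = np.zeros(n_features)   # weights updated at every step
    target = online.copy()          # frozen weights used for bootstrapping
    for step, (phi, reward, phi_next) in enumerate(transitions, start=1):
        # Bootstrapped target is computed with the target weights, not the online ones.
        td_target = reward + gamma * (phi_next @ target)
        td_error = td_target - phi @ online
        online += lr * td_error * phi
        # Periodic hard copy: this is what ties the two parameter vectors together.
        if step % sync_every == 0:
            target = online.copy()
    return online, target

# Hypothetical two-state chain: s0 -> s1 (reward 0), s1 -> terminal (reward 1).
s0, s1 = np.eye(2)
terminal = np.zeros(2)              # terminal state has value 0 by construction
episode = [(s0, 0.0, s1), (s1, 1.0, terminal)]
online, target = td_with_target_network(episode * 1000, n_features=2)
```

In this scheme the two networks are matched in parameter space: after each synchronization the target weights are an exact copy of the online weights.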