Synthesizing Programmatic Reinforcement Learning Policies with Large
Language Model Guided Search
Programmatic reinforcement learning (PRL) represents policies as programs in order to achieve interpretability and generalization. Despite promising results, current state-of-the-art PRL methods are sample-inefficient, requiring tens of millions of program-environment interactions. To address this challenge, we introduce a novel LLM-guided search framework …