Mastering Board Games by External and Internal Planning with Language Models

John Schultz, Jakub Adamek, Matej Jusup, Marc Lanctot, Michael Kaisers, Sarah Perrin, Daniel Hennes, Jeremy Shar, Chad Y. Lewis, Anian Ruoss

Type: Preprint

Publication Date: 2024-12-02

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2412.12119

Abstract

While large language models perform well on a range of complex tasks (e.g., text generation, question answering, summarization), robust multi-step planning and reasoning remain a considerable challenge for them. In this paper, we show that search-based planning can significantly improve LLMs' playing strength across several board games (Chess, Fischer Random / Chess960, Connect Four, and Hex). We introduce, compare, and contrast two major approaches. In external search, the model guides Monte Carlo Tree Search (MCTS) rollouts and evaluations without calls to an external engine; in internal search, the model directly generates, in context, a linearized tree of potential futures and a resulting final choice. Both build on a language model pre-trained on relevant domain knowledge, capturing the transition and value functions across these games. We find that our pre-training method minimizes hallucinations, as our model is highly accurate regarding state prediction and legal moves. Additionally, both internal and external search improve win rates against state-of-the-art bots, even reaching Grandmaster-level performance in chess while operating on a move-count search budget per decision similar to that of human Grandmasters. The way we combine search with domain knowledge is not specific to board games, suggesting direct extensions into more general language model inference and training techniques.
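
As a rough illustration of the external-search idea in the abstract, the sketch below runs a PUCT-style MCTS in which three stub functions play the role the abstract assigns to the language model: proposing legal moves, predicting the next state, and returning move priors with a value estimate. No separate game engine is called during search. Everything here is an assumption made for illustration and is not the authors' implementation: the function names, the toy game (single-pile Nim, take 1 to 3 stones, whoever takes the last stone wins), the uniform priors, the zero value for non-terminal states, and the exploration constant.

```python
import math
from dataclasses import dataclass, field

# Hypothetical stand-ins for the language model. In the paper's setting an LLM
# would predict these quantities; here a toy Nim game is hard-coded instead.
def model_legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def model_transition(stones, move):
    return stones - move

def model_evaluate(stones):
    """Return (priors, value), value seen from the player to move."""
    moves = model_legal_moves(stones)
    if not moves:  # no stones left: the player to move has already lost
        return {}, -1.0
    return {m: 1.0 / len(moves) for m in moves}, 0.0

@dataclass
class Node:
    prior: float
    visits: int = 0
    value_sum: float = 0.0  # summed from the view of the player to move here
    children: dict = field(default_factory=dict)

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    # PUCT score. A child's q() is from the opponent's view, hence the minus.
    def score(child):
        u = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return -child.q() + u
    return max(node.children.items(), key=lambda kv: score(kv[1]))

def mcts(root_stones, simulations=400):
    root = Node(prior=1.0)
    for _ in range(simulations):
        node, stones, path = root, root_stones, [root]
        # 1. Selection: walk down while the current node is already expanded.
        while node.children:
            move, node = select_child(node)
            stones = model_transition(stones, move)
            path.append(node)
        # 2. Expansion and evaluation, both supplied by the (stub) model.
        priors, value = model_evaluate(stones)
        for move, prior in priors.items():
            node.children[move] = Node(prior=prior)
        # 3. Backup: flip the sign at every ply because players alternate.
        for visited in reversed(path):
            visited.visits += 1
            visited.value_sum += value
            value = -value
    # Play the most-visited root move.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

if __name__ == "__main__":
    print("chosen move from 5 stones:", mcts(5))
```

In this sketch the search budget is the `simulations` argument, which loosely corresponds to the per-decision budget the abstract refers to. Swapping the three `model_*` stubs for calls to a trained model would leave the search loop unchanged.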

Locations

arXiv (Cornell University)

Similar Works

Title	Year	Authors
Can Large Language Models Play Games? A Case Study of A Self-Play Approach	2024	Hongyi Guo Zhihan Liu Yufeng Zhang Zhaoran Wang
Non-myopic Generation of Language Models for Reasoning and Planning	2024	Chang Ma Haiteng Zhao Junlei Zhang Junxian He Lingpeng Kong
GameTraversalBenchmark: Evaluating Planning Abilities Of Large Language Models Through Traversing 2D Game Maps	2024	Muhammad Nasir Steven R. James Julian Togelius
GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents	2024	Anthony Costarelli Mat Allen Roman Hauksson Grace Sodunke Suhas Hariharan Carlson Cheng Wenjie Li Arjun Yadav
Autonomous Tree-search Ability of Large Language Models	2023	Zheyu Zhang Zhuorui Ye Yikang Shen Chuang Gan
Can Language Models Serve as Text-Based World Simulators?	2024	Ruoyao Wang Graham Todd Ziang Xiao Xingdi Yuan Marc-Alexandre Côté Peter E. Clark Peter Jansen
Reasoning with Language Model is Planning with World Model	2023	Shibo Hao Yi Gu Haodi Ma Joshua Hong Zhen Wang Daisy Wang Zhiting Hu
Deliberate Reasoning for LLMs as Structure-aware Planning with Accurate World Model	2024	Siheng Xiong Ali Payani Yuan Yang Faramarz Fekri
PIANIST: Learning Partially Observable World Models with LLMs for Multi-Agent Decision Making	2024	Jonathan Light Sixue Xing Yuanzhe Liu Weiqin Chen Cai Min Xiusi Chen Guanzhi Wang Wei Cheng Yisong Yue Ziniu Hu
Reasoning with Language Model is Planning with World Model	2023	Shibo Hao Yi Gu Haodi Ma Joshua Jiahua Hong Zhen Wang Daisy Zhe Wang Zhiting Hu
Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming	2024	Yilun Hao Yang Zhang Chuchu Fan
PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset	2024	Arda Uzunoğlu Abdalfatah Rashid Safa Gözde Gül Şahin
Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation	2024	Sukai Huang Trevor Cohn Nir Lipovetzky
ChessGPT: Bridging Policy Learning and Language Modeling	2023	Xidong Feng Yicheng Luo Ziyan Wang Hongrui Tang Mengyue Yang Kun Shao David Mguni Yali Du Jun Wang
Keep CALM and Explore: Language Models for Action Generation in Text-based Games	2020	Shunyu Yao Rohan Rao Matthew Hausknecht Karthik Narasimhan
CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks	2024	Tianlong Wang Junzhe Chen Xueting Han Jing Bai
On the Planning Abilities of Large Language Models : A Critical Investigation	2023	Karthik Valmeekam Matthew Marquez Sarath Sreedharan Subbarao Kambhampati
PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization	2023	Xinyuan Wang Chenxi Li Zhen Wang Fan Bai Haotian Luo Jiayou Zhang Nebojša Jojić Eric P. Xing Zhiting Hu
Evaluation Beyond Task Performance: Analyzing Concepts in AlphaZero in Hex	2022	Charles Lovering Jessica Zosa Forde George Konidaris Ellie Pavlick Michael L. Littman

Works That Cite This (0)

Works Cited by This (0)
