Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs

Do language models have beliefs about the world? Dennett (1995) famously argues that even thermostats have beliefs, on the view that a belief is simply an informational state decoupled from any motivational state. In this paper, we discuss approaches to detecting when models have beliefs about the world, and we …