Frequency Effects on Syntactic Rule Learning in Transformers
Pre-trained language models perform well on a variety of linguistic tasks that require symbolic reasoning, raising the question of whether such models implicitly represent abstract symbols and rules. We investigate this question using the case study of BERT's performance on English subject–verb agreement. Unlike prior work, we train multiple instances …