BLiMP: The Benchmark of Linguistic Minimal Pairs for English
BLiMP: The Benchmark of Linguistic Minimal Pairs for English
We introduce The Benchmark of Linguistic Minimal Pairs (BLiMP), 1 a challenge set for evaluating the linguistic knowledge of language models (LMs) on major grammatical phenomena in English. BLiMP consists of 67 individual datasets, each containing 1,000 minimal pairs—that is, pairs of minimally different sentences that contrast in grammatical acceptability …