Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence

This paper introduces Filtered Corpus Training, a method that trains language models (LMs) on corpora from which certain linguistic constructions have been filtered out, and uses it to measure the ability of LMs to perform linguistic generalization on the basis of indirect evidence. We apply the method to both …
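
The filtering step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the regex `PASSIVE_LIKE`, the function `filter_corpus`, and the toy corpus are all hypothetical, and a crude surface pattern stands in for whatever construction detectors the authors actually use (not specified in this excerpt).

```python
import re
from typing import Iterable, List

# Hypothetical target construction: a rough surface approximation of English
# passives such as "was eaten" or "were written". This is an assumption made
# for illustration only; real filters would likely rely on parses or tags.
PASSIVE_LIKE = re.compile(
    r"\b(?:is|are|was|were|been|being|be)\s+\w+(?:ed|en)\b",
    re.IGNORECASE,
)


def filter_corpus(sentences: Iterable[str],
                  pattern: "re.Pattern[str]" = PASSIVE_LIKE) -> List[str]:
    """Keep only the sentences that do NOT contain the target construction."""
    return [s for s in sentences if not pattern.search(s)]


# Toy corpus (invented for this example).
corpus = [
    "The cake was eaten by the children.",
    "The children ate the cake.",
    "Letters were written every day.",
    "She writes letters every day.",
]

# The filtered corpus is what a model would be trained on; the removed
# construction can then be used at evaluation time to probe generalization.
filtered = filter_corpus(corpus)
```

In this sketch the two passive-like sentences are dropped and the two active ones are kept, so a model trained on `filtered` would never see the target construction directly.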