A Systematic Assessment of Syntactic Generalization in Neural Language Models

Type: Preprint

Publication Date: 2020-01-01

Citations: 128

DOI: https://doi.org/10.18653/v1/2020.acl-main.158


Abstract

While state-of-the-art neural network models continue to achieve lower perplexity scores on language modeling benchmarks, it remains unknown whether optimizing for broad-coverage predictive performance leads to human-like syntactic knowledge. Furthermore, existing work has not provided a clear picture about the model properties required to produce proper syntactic generalizations. We present a systematic evaluation of the syntactic knowledge of neural language models, testing 20 combinations of model types and data sizes on a set of 34 English-language syntactic test suites. We find substantial differences in syntactic generalization performance by model architecture, with sequential models underperforming other architectures. Factorially manipulating model architecture and training dataset size (1M-40M words), we find that variability in syntactic generalization performance is substantially greater by architecture than by dataset size for the corpora tested in our experiments. Our results also reveal a dissociation between perplexity and syntactic generalization performance.

Locations

  • arXiv (Cornell University)

Similar Works

  • The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models (2024). Adithya Bhaskar, Dan Friedman, Danqi Chen
  • Overestimation of Syntactic Representation in Neural Language Models (2020). Jordan Kodner, Nitish Gupta
  • Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing (2022). Linlu Qiu, Peter Shaw, Panupong Pasupat, Tianze Shi, Jonathan Herzig, Emily Pitler, Fei Sha, Kristina Toutanova
  • How to Plant Trees in Language Models: Data and Architectural Effects on the Emergence of Syntactic Inductive Biases (2023). Aaron Mueller, Tal Linzen
  • Evaluating Structural Generalization in Neural Machine Translation (2024). Ryoma Kumon, Daiki Matsuoka, Hitomi Yanaka
  • Scalable Syntax-Aware Language Models Using Knowledge Distillation (2019). Adhiguna Kuncoro, Chris Dyer, Laura Rimell, Stephen Clark, Phil Blunsom
  • Syntactic Structure Distillation Pretraining for Bidirectional Encoders (2020). Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom
  • Compositional generalization in a deep seq2seq model by separating syntax and semantics (2019). Jacob Russin, Jason Jo, Randall C. O’Reilly, Yoshua Bengio
  • COGS: A Compositional Generalization Challenge Based on Semantic Interpretation (2020). Najoung Kim, Tal Linzen
  • SLOG: A Structural Generalization Benchmark for Semantic Parsing (2023). Bingzhi Li, Lucia Donatelli, Alexander Koller, Tal Linzen, Yuekun Yao, Najoung Kim

Works That Cite This (70)

  • Language models align with human judgments on key grammatical constructions (2024). Jennifer Hu, Kyle Mahowald, Gary Lupyan, Anna A. Ivanova, Roger Lévy
  • When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it (2022). Sebastian Schuster, Tal Linzen
  • Single-Stage Prediction Models Do Not Explain the Magnitude of Syntactic Disambiguation Difficulty (2021). Marten van Schijndel, Tal Linzen
  • Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling (2020). Yiding Hao, Simon Mendelsohn, Rachel Sterneck, Randi Martinez, Robert Frank
  • Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics (2023). Yuhan Zhang, Edward Gibson, Forrest Davis
  • Meaning creation in novel noun-noun compounds: humans and language models (2023). Phoebe Chen, David Poeppel, Arianna Zuanazzi
  • Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality (2022). Tristan Thrush, Ryan Jiang, Max Bartolo, Amanpreet Singh, Adina Williams, Douwe Kiela, Candace Ross
  • Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually) (2020). Alex Warstadt, Yian Zhang, Xiaocheng Li, Haokun Liu, Samuel R. Bowman
  • Neural reality of argument structure constructions (2022). Bai Li, Zining Zhu, Guillaume F. Thomas, Frank Rudzicz, Yang Xu