SLOG: A Structural Generalization Benchmark for Semantic Parsing
The goal of compositional generalization benchmarks is to evaluate how well models generalize to new complex linguistic expressions. Existing benchmarks often focus on lexical generalization, the interpretation of novel lexical items in syntactic structures familiar from training; structural generalization tasks, where a model needs to interpret syntactic structures that are …