Fourier Position Embedding: Enhancing Attention's Periodic Extension for
Length Generalization
Extending the context length of Language Models (LMs) by improving Rotary Position Embedding (RoPE) has become a trend. While existing works mainly address RoPE's limitations within the attention mechanism, this paper provides an analysis across nearly all parts of LMs, uncovering how these components adversely affect length generalization in RoPE-based attention. Using …
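For background on the object under analysis, the following is a minimal sketch of standard RoPE in its common rotate-half formulation. It is illustrative only, not the paper's FoPE method, and the function name `rope` and its defaults are placeholders.

```python
import numpy as np

def rope(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply Rotary Position Embedding to x of shape (seq_len, dim); dim must be even."""
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies: theta_i = base^(-2i / dim)
    freqs = base ** (-np.arange(half) * 2.0 / dim)          # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None, :]    # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each 2D feature pair by a position-dependent angle
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

Each feature pair is rotated by an angle proportional to the token position, so relative positions enter attention scores through phase differences; the frequencies `theta_i` are the spectral components whose behavior outside the training length the paper's analysis targets.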