What Do Position Embeddings Learn? An Empirical Study of Pre-Trained Language Model Positional Encoding
In recent years, pre-trained Transformers have dominated the majority of NLP benchmark tasks. Many variants of pre-trained Transformers keep emerging, and most focus on designing different pre-training objectives or variants of self-attention. Embedding position information in the self-attention mechanism is also an indispensable component of Transformers; however …
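To make the mechanism under study concrete, the following is a minimal sketch (not taken from the paper) of how learned absolute position embeddings enter a Transformer input layer, in the style of BERT or GPT-2: a position embedding table is simply added to the token embeddings before the first self-attention layer. The class name, dimensions, and vocabulary size here are illustrative assumptions.

```python
# Minimal illustrative sketch of learned absolute position embeddings
# (BERT/GPT-2 style); names and sizes are hypothetical.
import torch
import torch.nn as nn

class TransformerInput(nn.Module):
    def __init__(self, vocab_size=30522, max_len=512, d_model=768):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)  # word embeddings
        self.pos_emb = nn.Embedding(max_len, d_model)        # learned position embeddings
        self.norm = nn.LayerNorm(d_model)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len)
        seq_len = token_ids.size(1)
        positions = torch.arange(seq_len, device=token_ids.device)  # 0 .. seq_len-1
        # Position embeddings are added element-wise to token embeddings
        # before the first self-attention layer.
        x = self.token_emb(token_ids) + self.pos_emb(positions)
        return self.norm(x)

# Usage: a batch of two 8-token sequences.
inputs = torch.randint(0, 30522, (2, 8))
print(TransformerInput()(inputs).shape)  # torch.Size([2, 8, 768])
```

Other Transformer variants replace this learned table with fixed sinusoidal functions or with relative position terms inside the attention computation; the sketch above only illustrates the learned absolute case.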