Character-Level Language Modeling with Deeper Self-Attention

Type: Preprint

Publication Date: 2018-01-01

Citations: 4

DOI: https://doi.org/10.48550/arXiv.1808.04444

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • Character-Level Language Modeling with Deeper Self-Attention (2018) - Rami Al-Rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
  • Character-Level Language Modeling with Deeper Self-Attention (2019) - Rami Al-Rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
  • Word-Level Representation From Bytes For Language Modeling (2022) - Chu-Tak Lee, Qipeng Guo, Xipeng Qiu
  • Learning to Look Inside: Augmenting Token-Based Encoders with Character-Level Information (2021) - Yuval Pinter, Amanda Stent, Mark Dredze, Jacob Eisenstein
  • Mogrifier LSTM (2019) - Gábor Melis, Tomáš Kočiský, Phil Blunsom
  • Bridging the Gap for Tokenizer-Free Language Models (2019) - Dokook Choe, Rami Al-Rfou, Mandy Guo, Heeyoung Lee, Noah Constant
  • Adaptive Attention Span in Transformers (2019) - Sainbayar Sukhbaatar, Édouard Grave, Piotr Bojanowski, Armand Joulin
  • When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute (2021) - Tao Lei
  • Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context (2019) - Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov
  • MANTa: Efficient Gradient-Based Tokenization for Robust End-to-End Language Modeling (2022) - Nathan Godey, Roman Castagné, Éric Villemonte de la Clergerie, Benoît Sagot
  • What is the best recipe for character-level encoder-only modelling? (2023) - Kris Cao
  • Models In a Spelling Bee: Language Models Implicitly Learn the Character Composition of Tokens (2022) - Itay Itzhak, Omer Levy
  • Models In a Spelling Bee: Language Models Implicitly Learn the Character Composition of Tokens (2021) - Itay Itzhak, Omer Levy
  • Augmenting Self-attention with Persistent Memory (2019) - Sainbayar Sukhbaatar, Édouard Grave, Guillaume Lample, Hervé Jégou, Armand Joulin
  • Learn Your Tokens: Word-Pooled Tokenization for Language Modeling (2023) - Avijit Thawani, Saurabh Ghanekar, Xiaoyuan Zhu, Jay Pujara

Works Cited by This (0)
