A Fast, Compact, Accurate Model for Language Identification of Codemixed Text
A Fast, Compact, Accurate Model for Language Identification of Codemixed Text
We address fine-grained multilingual language identification: providing a language code for every token in a sentence, including codemixed text containing multiple languages. Such text is prevalent online, in documents, social media, and message boards. We show that a feed-forward network with a simple globally constrained decoder can accurately and rapidly …