Early Stage Sparse Retrieval with Entity Linking

Type: Article

Publication Date: 2022-10-15

Citations: 7

DOI: https://doi.org/10.1145/3511808.3557588

Abstract

Despite the advantages of their low-resource settings, traditional sparse retrievers depend on exact matching between high-dimensional bag-of-words (BoW) representations of both the queries and the collection. As a result, retrieval performance is restricted by semantic discrepancies and vocabulary gaps. On the other hand, transformer-based dense retrievers introduce significant improvements in information retrieval tasks by exploiting low-dimensional contextualized representations of the corpus. While dense retrievers are known for their relative effectiveness, they suffer from lower efficiency and weaker generalization than sparse retrievers. For a lightweight retrieval task, high computational resources and time consumption are major barriers that encourage abandoning dense models despite their potential gains. In this work, we propose boosting the performance of sparse retrievers by expanding both the queries and the documents with linked entities in two formats for the entity names: 1) explicit and 2) hashed. We employ a zero-shot end-to-end dense entity linking system for entity recognition and disambiguation to augment the corpus. By leveraging advanced entity linking methods, we believe that the effectiveness gap between sparse and dense retrievers can be narrowed. We conduct our experiments on the MS MARCO passage dataset. Since we are concerned with early-stage retrieval in cascaded ranking architectures of large information retrieval systems, we evaluate our results using recall@1000. Our approach is also capable of retrieving documents for query subsets judged to be particularly difficult in prior work. We further demonstrate that the non-expanded and the expanded runs with both explicit and hashed entities retrieve complementary results. Consequently, we adopt a run fusion approach to maximize the benefits of entity linking.
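
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made here for illustration: the entity names are taken as given (in the paper they come from an entity linking system), MD5 stands in for the unspecified hashing scheme, and reciprocal rank fusion stands in for the unspecified run fusion method. The function names `expand_text`, `recall_at_k`, and `reciprocal_rank_fusion` are hypothetical.

```python
import hashlib
from collections import defaultdict


def expand_text(text, entities, mode="explicit"):
    """Append linked entity names to a query or passage.

    `entities` is assumed to come from an external entity linker.
    mode="explicit" appends the names as ordinary words;
    mode="hashed" appends one opaque token per entity (MD5 is an
    illustrative choice, not necessarily the paper's scheme).
    """
    if mode == "explicit":
        extra = list(entities)
    elif mode == "hashed":
        extra = [hashlib.md5(e.lower().encode("utf-8")).hexdigest()
                 for e in entities]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return " ".join([text, *extra]) if extra else text


def recall_at_k(ranked_ids, relevant_ids, k=1000):
    """Fraction of the relevant passages that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(relevant & set(ranked_ids[:k])) / len(relevant)


def reciprocal_rank_fusion(runs, k=60):
    """Fuse ranked lists: each list adds 1 / (k + rank) to a passage's score."""
    scores = defaultdict(float)
    for run in runs:
        for rank, doc_id in enumerate(run, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    # Toy example with made-up passage ids.
    print(expand_text("who founded apple", ["Apple Inc."], mode="explicit"))
    plain_run = ["p3", "p1", "p7"]      # run on the non-expanded index
    expanded_run = ["p9", "p3", "p2"]   # run on an entity-expanded index
    fused = reciprocal_rank_fusion([plain_run, expanded_run])
    print(fused, recall_at_k(fused, {"p3", "p9"}, k=2))
```

In this toy example each run retrieves a relevant passage the other misses, so the fused list recovers both. That mirrors the complementarity argument in the abstract, although the numbers here are invented for illustration only.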

Locations

  • arXiv (Cornell University)
  • Proceedings of the 31st ACM International Conference on Information & Knowledge Management

Similar Works

  • Early Stage Sparse Retrieval with Entity Linking (2022) - Dahlia Shehata, Negar Arabzadeh, Charles L. A. Clarke
  • Information Retrieval with Entity Linking (2024) - Dahlia Shehata
  • Improving Zero-Shot Entity Retrieval through Effective Dense Representations (2021) - Eleni Partalidou, Despina Christou, Grigorios Tsoumakas
  • Revisiting Sparse Retrieval for Few-shot Entity Linking (2023) - Yulin Chen, Zhenran Xu, Baotian Hu, Min Zhang
  • Improving Zero-Shot Entity Retrieval through Effective Dense Representations (2022) - Eleni Partalidou, Despina Christou, Grigorios Tsoumakas
  • Salient Phrase Aware Dense Retrieval: Can a Dense Retriever Imitate a Sparse One? (2021) - Xilun Chen, Kushal Lakhotia, Barlas Oğuz, Anchit Gupta, Patrick Lewis, Stan Peshterliev, Yashar Mehdad, Sonal Gupta, Wen-tau Yih
  • Scalable Zero-shot Entity Linking with Dense Entity Retrieval (2019) - Ledell Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, Luke Zettlemoyer
  • Scalable Zero-shot Entity Linking with Dense Entity Retrieval (2020) - Ledell Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, Luke Zettlemoyer
  • Building Interpretable and Reliable Open Information Retriever for New Domains Overnight (2023) - Xiaodong Yu, Ben Zhou, Dan Roth
  • Query Embedding Pruning for Dense Retrieval (2021) - Nicola Tonellotto, Craig Macdonald
  • Early Fusion Strategy for Entity-Relationship Retrieval (2017) - Pedro Saleiro, Nataša Milić-Frayling, Eduarda Mendes Rodrigues, Carlos Soares
  • Autoregressive Entity Retrieval (2020) - Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni
  • Entity Retrieval for Answering Entity-Centric Questions (2024) - Hassan S. Shavarani, Anoop Sarkar
  • Towards Unsupervised Dense Information Retrieval with Contrastive Learning (2021) - Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, Édouard Grave
  • Learning Dense Representations for Entity Retrieval (2019) - Daniel Gillick, Sayali Kulkarni, Larry Lansing, Alessandro Presta, Jason Baldridge, Eugene Ie, Diego García-Olano
  • Fast Passage Re-ranking with Contextualized Exact Term Matching and Efficient Passage Expansion (2021) - Shengyao Zhuang, Guido Zuccon
  • Multilingual Entity Linking Using Dense Retrieval (2024) - Dominik Farhan
  • Improving Few-shot and Zero-shot Entity Linking with Coarse-to-Fine Lexicon-based Retriever (2023) - Shijue Huang, Bingbing Wang, Libo Qin, Qin Zhao, Ruifeng Xu