Efficient and Effective Tail Latency Minimization in Multi-Stage Retrieval Systems.

Scalable web search systems typically employ multi-stage retrieval architectures, where an initial stage generates a set of candidate documents that are then pruned and re-ranked. Since subsequent stages typically exploit a multitude of features of varying costs using machine-learned models, reducing the number of documents that are considered at each …