Efficient query processing techniques for next-page retrieval
Information Retrieval Journal (IF 2.5), Pub Date: 2022-01-18, DOI: 10.1007/s10791-021-09402-7
Joel Mackenzie, Alistair Moffat, Matthias Petri

In top-k ranked retrieval, the goal is to efficiently compute an ordered list of the k highest-scoring documents according to some stipulated similarity function, such as the well-known BM25 approach. In most implementation techniques, a min-heap of size k is used to track the top-scoring candidates. In this work we consider the question of how best to retrieve the second page of search results, given that a first page has already been computed; that is, identification of the documents at ranks \(k+1\) to \(2k\) for some query. Our goal is to understand what information is available as a by-product of the first-page scoring, and how it can be employed to accelerate the second-page computation, assuming that the second page of results is required for only a fraction of the query load. We propose a range of simple, yet efficient, next-page retrieval techniques which are suitable for accelerating Document-at-a-Time mechanisms, and demonstrate their performance on three large text collections.
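The min-heap bookkeeping mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce any of the proposed next-page techniques: the function names `top_k` and `second_page_baseline` are hypothetical, scores are assumed to be precomputed in a dictionary rather than produced by a Document-at-a-Time traversal, and the second page is obtained by the naive baseline of recomputing the top \(2k\) and discarding the first \(k\).

```python
import heapq
from typing import Dict, List, Tuple


def top_k(scores: Dict[int, float], k: int) -> List[Tuple[int, float]]:
    """Return the k highest-scoring (doc_id, score) pairs, best first."""
    if k <= 0:
        return []
    # Min-heap of (score, doc_id); the root is the weakest retained candidate,
    # so its score acts as the entry threshold once the heap is full.
    heap: List[Tuple[float, int]] = []
    for doc_id, score in scores.items():
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # The new document beats the current threshold: evict the root.
            heapq.heapreplace(heap, (score, doc_id))
    # Order the survivors by decreasing score, breaking ties by doc_id.
    ranked = sorted(heap, key=lambda entry: (-entry[0], entry[1]))
    return [(doc_id, score) for score, doc_id in ranked]


def second_page_baseline(scores: Dict[int, float], k: int) -> List[Tuple[int, float]]:
    """Naive baseline (an assumption for illustration): ranks k+1..2k via a full top-2k run."""
    return top_k(scores, 2 * k)[k:]


if __name__ == "__main__":
    # Hypothetical document scores, e.g. as produced by a BM25 scorer.
    example = {1: 0.5, 2: 2.0, 3: 1.5, 4: 0.1, 5: 3.0, 6: 1.0}
    print(top_k(example, 2))
    print(second_page_baseline(example, 2))
```

The baseline repeats all of the first-page work whenever a second page is requested, which is the cost that motivates reusing by-products of the first-page computation.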

Updated: 2022-01-18