Efficient query processing techniques for next-page retrieval
Information Retrieval Journal ( IF 2.5 ) Pub Date : 2022-01-18 , DOI: 10.1007/s10791-021-09402-7
Joel Mackenzie 1 , Alistair Moffat 1 , Matthias Petri 2

In top-k ranked retrieval the goal is to efficiently compute an ordered list of the k highest-scoring documents according to some stipulated similarity function, such as the well-known BM25 approach. Most implementations use a min-heap of size k to track the top-scoring candidates. In this work we consider the question of how best to retrieve the second page of search results, given that a first page has already been computed; that is, identification of the documents at ranks \(k+1\) to \(2k\) for some query. Our goal is to understand what information is available as a by-product of the first-page scoring, and how it can be employed to accelerate the second-page computation, assuming that the second page of results is required for only a fraction of the query load. We propose a range of simple, yet efficient, next-page retrieval techniques which are suitable for accelerating Document-at-a-Time mechanisms, and demonstrate their performance on three large text collections.
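
To make the setting concrete, the following is a minimal Python sketch of the heap-based top-k computation the abstract refers to, together with a naive second-page baseline that rescans all documents while skipping those already returned. The function names `top_k` and `next_page`, the dictionary of precomputed scores, and the exhaustive scan are all illustrative assumptions; this is not the set of techniques proposed in the paper, which targets Document-at-a-Time processing over an inverted index.

```python
import heapq


def top_k(scores, k, exclude=frozenset()):
    """Return the k highest-scoring (doc_id, score) pairs, best first.

    `scores` is a hypothetical mapping doc_id -> similarity score (for
    example BM25 values computed elsewhere). A min-heap of size k holds
    the current candidates; its root is the weakest candidate, so its
    score acts as the threshold a new document must beat to enter.
    """
    heap = []  # entries are (score, doc_id); smallest score at heap[0]
    for doc_id, score in scores.items():
        if doc_id in exclude:
            continue  # already reported on an earlier page
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # Evict the current weakest candidate in favour of this one.
            heapq.heapreplace(heap, (score, doc_id))
    ranked = sorted(heap, key=lambda entry: (-entry[0], entry[1]))
    return [(doc_id, score) for score, doc_id in ranked]


def next_page(scores, k, first_page):
    """Naive baseline: rescan everything, skipping first-page documents."""
    seen = {doc_id for doc_id, _ in first_page}
    return top_k(scores, k, exclude=seen)
```

With distinct scores, concatenating the first page and the result of `next_page` gives the same ranking as a single top-2k computation. The baseline repeats the full scoring cost for the second page, which is the cost the abstract's question is concerned with reducing.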




Updated: 2022-01-18