Scaling Laws For Dense Retrieval
arXiv - CS - Information Retrieval. Pub Date: 2024-03-27, DOI: arxiv-2403.18684
Yan Fang, Jingtao Zhan, Qingyao Ai, Jiaxin Mao, Weihang Su, Jia Chen, Yiqun Liu

Scaling up neural models has yielded significant advancements in a wide array of tasks, particularly in language generation. Previous studies have found that the performance of neural models frequently adheres to predictable scaling laws, correlated with factors such as training set size and model size. This insight is invaluable, especially as large-scale experiments grow increasingly resource-intensive. Yet such scaling laws have not been fully explored in dense retrieval, owing to the discrete nature of retrieval metrics and the complex relationship between training data and model size in retrieval tasks. In this study, we investigate whether the performance of dense retrieval models follows the scaling laws observed in other neural models. We propose using contrastive log-likelihood as the evaluation metric and conduct extensive experiments with dense retrieval models implemented with different numbers of parameters and trained on different amounts of annotated data. Results indicate that, under our settings, the performance of dense retrieval models follows a precise power-law scaling related to model size and the number of annotations. Additionally, we examine scaling with prevalent data augmentation methods to assess the impact of annotation quality, and apply the scaling law to find the best resource allocation strategy under a budget constraint. We believe these insights will significantly contribute to understanding the scaling effect of dense retrieval models and offer meaningful guidance for future research endeavors.
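The abstract reports that retrieval performance, measured by contrastive log-likelihood, follows a power law in model size and annotation count, but it does not spell out the functional form. As a rough illustration only, the Python sketch below fits a joint power law of the form L(N, D) = (A/N)^alpha + (B/D)^beta + E, a parameterization commonly used in scaling-law studies and assumed here; the data, parameter values, and functional form are illustrative assumptions, not results from the paper.

import numpy as np
from scipy.optimize import curve_fit

def scaling_law(x, A, alpha, B, beta, E):
    """Joint power law in model size N and annotation count D (assumed illustrative form)."""
    N, D = x
    return (A / N) ** alpha + (B / D) ** beta + E

# Synthetic measurements: a grid of hypothetical model sizes (parameters) and
# annotation counts, with losses generated from the assumed law plus 1% noise.
rng = np.random.default_rng(0)
N = np.tile([30e6, 110e6, 340e6], 3)        # hypothetical model sizes
D = np.repeat([1e5, 5e5, 2e6], 3)           # hypothetical numbers of annotations
true_params = (5e7, 0.35, 8e5, 0.30, 0.6)   # A, alpha, B, beta, E (made up)
L = scaling_law((N, D), *true_params) * (1 + 0.01 * rng.standard_normal(N.size))

# Fit the law to the observed (N, D, L) triples.
fit, _ = curve_fit(scaling_law, (N, D), L,
                   p0=[1e7, 0.3, 1e6, 0.3, 0.5],
                   bounds=(0, np.inf), maxfev=20000)
A, alpha, B, beta, E = fit
print(f"fitted exponents: alpha={alpha:.3f}, beta={beta:.3f}, irreducible term E={E:.3f}")

# A fitted law supports the kind of budget analysis the abstract mentions:
# compare predicted losses for alternative (model size, annotation) allocations.
for n, d in [(60e6, 1e6), (120e6, 5e5), (240e6, 2.5e5)]:
    print(f"N={n:.0e}, D={d:.0e} -> predicted loss {scaling_law((n, d), *fit):.3f}")

Once such a fit is obtained, the relative size of the fitted exponents indicates whether extra budget is better spent on a larger model or on more annotated data, which is the resource-allocation question the abstract raises.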

Updated: 2024-03-28