Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems
ACM Transactions on Information Systems (IF 5.6), Pub Date: 2024-02-09, DOI: 10.1145/3637869
Zhengbang Zhu 1, Rongjun Qin 2, Junjie Huang 1, Xinyi Dai 1, Yang Yu† 2, Yong Yu 1, Weinan Zhang† 1

Recommender systems are expected to act as assistants that help human users find relevant information automatically, without explicit queries. As recommender systems evolve, increasingly sophisticated learning techniques are applied and achieve better performance on user engagement metrics such as clicks and browsing time. The increase in measured performance, however, has two possible explanations: a better understanding of user preferences, and a more proactive ability to exploit human bounded rationality to induce user over-consumption. A natural follow-up question is whether current recommendation algorithms are manipulating user preferences and, if so, whether the level of manipulation can be measured. In this article, we present a general framework for benchmarking the degree of manipulation of recommendation algorithms in both slate recommendation and sequential recommendation scenarios. The framework consists of four stages: initial preference calculation, training data collection, algorithm training and interaction, and metrics calculation, which involves two proposed metrics, Manipulation Score and Preference Shift. We benchmark representative recommendation algorithms on both synthetic and real-world datasets under the proposed framework. We observe that a high online click-through rate does not necessarily indicate a better understanding of users' initial preferences; it can instead result from prompting users to choose more documents they initially did not favor. Moreover, we find that the training data has a notable impact on the degree of manipulation, and that algorithms with more powerful modeling abilities are more sensitive to this impact. The experiments also verify the usefulness of the proposed metrics for measuring the degree of manipulation. We advocate that future recommendation algorithm studies treat recommendation as an optimization problem with constraints on user preference manipulation.




Updated: 2024-02-14