Stop trying to predict elections only with twitter – There are other data sources and technical issues to be improved
Government Information Quarterly (IF 8.490), Pub Date: 2023-12-23, DOI: 10.1016/j.giq.2023.101899
Kellyton Brito , Rogério Luiz Cardoso Silva Filho , Paulo Jorge Leitão Adeodato

Since the popularization of social media (SM) platforms, researchers have been trying to use their data to predict electoral results. Previous surveys point out that the most used approach is based on volume and sentiment analysis of posts on Twitter. However, they are almost unanimous in presenting that the results are not better than chance. In this context, this study aims to investigate the feasibility of predicting electoral results based only on Twitter, discover the main issues, and draw guidelines for future alternative directions. For this, we reviewed the evolution of election polling and predictions, including the “polling crises” of 1936 and 1948, and their similarities with current approaches. We also built on the official SM platforms' documentation and on our experience collecting and analyzing large-scale data from many SM platforms. Lastly, we analyzed nine reviews on predicting elections with SM data from 2013 to 2021. We observed that, contrary to initial expectations, most of the current research with Twitter has been unable to solve many of the challenges encountered since initial studies, and also shares many of the characteristics of unsuccessful straw polls performed before 1936. We illustrate that by highlighting the impracticability of polling over Twitter due to several biases and technical barriers, the need for external data, the high dependency on the arbitrary decisions of researchers, and the constant change in platforms' scenarios, that may invalidate specific models. Lastly, we indicate some of the possible future directions, such as a focus on creating repeatable processes; the use of SM data as part of statistical models, instead of polling; diversifying the input data sources, including multiple SM platforms and non-SM data such as polls and economic indicators; using machine learning for regression of the vote share, rather than for sentiment analysis; and dealing with the uncertainty of the highly divergent polling results.

Updated: 2023-12-23