Abstract
The critical role and wide influence of the hot trend/topic page (HTP) on microblogging sites have given rise to a new kind of underground service called the bogus traffic service (BTS). A BTS is an illegal service that hijacks the HTP on behalf of malicious customers, pushing customer-chosen topics onto it in order to steer public opinion. To hijack the HTP, BTS agents maintain an army of black-market accounts, called bogus traffic accounts (BTAs), and direct them to generate a burst of fake traffic by massively retweeting tweets that contain the customer's desired topic (hashtag). Although this service has been extensively exploited by malicious customers, little has been done to understand it. In this paper, we conduct a systematic measurement study of BTS. We first identify and collect 125 BTS agents from a variety of sources and set up a honeypot account to capture BTAs from these agents. We then build a BTA detector that detects 162 218 BTAs on Weibo, the largest Chinese microblogging site, with a precision of 94.5%. We further use these BTAs as a bridge to uncover 296 916 topics that might be involved in bogus traffic. Finally, we uncover the operating mechanism of BTS from the perspectives of the attack cycle and the attack entity. Highlights of our findings include the temporal attack patterns and intelligent evasion tactics of the BTAs. These findings bring BTS into the spotlight, and our work will help in understanding and ultimately eliminating this threat.
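Abstract (translated from the Chinese)
Owing to the enormous influence of the hot trend/topic page on online social network platforms, a new gray- and black-market industry known as the social network bogus traffic service has emerged. This malicious service lets customers who want to steer public opinion push their chosen topics onto the hot trend/topic page. To hijack the page, the service providers maintain an army of malicious accounts called "bogus traffic accounts" and direct them to generate large volumes of fake traffic by massively retweeting, within a short time, tweets that contain the customer's desired topic (hashtag). Although this service has widely affected the social network ecosystem, little is known about it. This paper presents a systematic measurement study of the social network bogus traffic service. We first investigate and identify 125 bogus traffic providers from different sources and set up a honeypot account to capture the malicious accounts they control. We then build a bogus traffic detector that detects 162 218 malicious accounts on Sina Weibo, the largest microblogging site in China, with a precision of 94.5%. Using these malicious accounts as a bridge, we further uncover 296 916 topics that may involve bogus traffic. Finally, we reveal the operating mechanism of the bogus traffic gray- and black-market industry chain from the perspectives of the attack cycle and the attack entity. In particular, we identify the temporal attack patterns and intelligent evasion tactics of the malicious accounts involved in bogus traffic. These findings expose the operating mechanism of social network bogus traffic to public view. Building on them, our work will help in understanding and ultimately eliminating this threat.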
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
The authors would like to thank Haofei YU for suggestions on the detection method and Xueyan LYU for investigating the BTS marketplace. The authors would also like to thank Tianyu DU and Yiming WU for their suggestions on revising the paper. We acknowledge the support of the SRTP project in the College of Computer Science and Technology of Zhejiang University and the NGICS platform of Zhejiang University.
Author information
Contributions
Ping HE designed the research, processed the data, conducted the experiments, and drafted the paper. Xuhong ZHANG, Changting LIN, Ting WANG, and Shouling JI helped organize the paper. Ping HE and Shouling JI revised and finalized the paper.
Ethics declarations
Shouling JI is a corresponding expert of Frontiers of Information Technology & Electronic Engineering, and he was not involved with the peer review process of this paper. All the authors declare that they have no conflict of interest.
Additional information
List of supplementary materials
1 Communication channel analysis
2 Honeypot account
3 Evasive tweets
4 Profile-based features
5 Case study
6 Weibo authentication rules
Fig. S1 The announcement in our honeypot account
Fig. S2 Two examples of evasive tweets in our dataset
Fig. S3 Feature analysis of the profile-based features
Fig. S4 Bogus traffic distributions for three superstars
About this article
Cite this article
He, P., Zhang, X., Lin, C. et al. Towards understanding bogus traffic service in online social networks. Front Inform Technol Electron Eng 25, 415–431 (2024). https://doi.org/10.1631/FITEE.2300068
DOI: https://doi.org/10.1631/FITEE.2300068