Distributed web hacking by adaptive consensus-based reinforcement learning, Artificial Intelligence

Artificial Intelligence (IF 14.4), Pub Date: 2023-10-24, DOI: 10.1016/j.artint.2023.104032
Nemanja Ilić, Dejan Dašić, Miljan Vučetić, Aleksej Makarov, Ranko Petrović

In this paper, we propose a novel adaptive consensus-based learning algorithm for automated and distributed web hacking. We aim to assist ethical hackers in conducting legitimate penetration testing and improving web security by identifying system vulnerabilities at an early stage. Ethical hacking is modeled as a capture-the-flag style task addressed within a distributed reinforcement learning framework. To achieve our goal, we employ interconnected intelligent agents that interact with their copies of the environment simultaneously to reach the target. They perform local information processing to optimize their policies and exchange information with neighboring agents. We propose a novel adaptive consensus scheme for inter-agent communications, which enables the agents to efficiently share network-wide information in a decentralized manner. The scheme dynamically adjusts its weights based on heuristics, involving both recency and frequency metrics of actions selected at a given state by an individual agent, similar to eligibility traces. We extensively analyze the convergence properties of our algorithm and introduce a new communication scheme design. We demonstrate that this design ensures the fastest convergence to the desired asymptotic values under the general setting of asymmetric communication topologies. Additionally, we provide a comprehensive review of the current state of the field and propose a web agent model with improved scalability compared to existing solutions. Numerical simulations are conducted to illustrate the key characteristics of our algorithm. The results demonstrate that it outperforms both non-cooperative and average consensus schemes. Moreover, our algorithm significantly reduces hacking times when compared to baseline algorithms that rely on more complex models. These findings offer valuable insights to system security administrators, enabling them to address identified shortcomings and vulnerabilities effectively.
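The abstract describes consensus weights that adapt to how recently and how often an agent selected an action in a given state. It does not give the update rules, so the following is only a minimal sketch of that general idea, not the authors' algorithm. It assumes an accumulating eligibility-trace counter per agent and row-normalized weights over each agent's in-neighbors; the names `update_trace` and `consensus_step` and the `decay` parameter are hypothetical.

```python
from typing import List


def update_trace(trace: float, selected: bool, decay: float = 0.9) -> float:
    """Accumulating trace for one (state, action) pair of one agent.

    ASSUMPTION: the trace decays every step (recency) and grows by 1 each
    time the agent selects the action in that state (frequency).
    """
    return decay * trace + (1.0 if selected else 0.0)


def consensus_step(
    q_values: List[float],
    traces: List[float],
    neighbors: List[List[int]],
    eps: float = 1e-12,
) -> List[float]:
    """One consensus round for a single (state, action) entry.

    q_values[i]  : agent i's current value estimate for the entry
    traces[i]    : agent i's trace for the same entry
    neighbors[i] : agents that i receives from (should include i itself);
                   the lists need not be symmetric, so the topology may be
                   directed.

    ASSUMPTION: agent i weights each neighbor j by traces[j] divided by the
    sum of traces over neighbors[i], so each agent's weights sum to 1.
    """
    new_q = []
    for i, nbrs in enumerate(neighbors):
        total = sum(traces[j] for j in nbrs)
        if total <= eps:
            # No neighbor has experience with this entry: keep the own value.
            new_q.append(q_values[i])
            continue
        new_q.append(sum(traces[j] / total * q_values[j] for j in nbrs))
    return new_q


if __name__ == "__main__":
    # Three agents on a directed ring; only agent 1 has tried the action.
    q = [0.0, 1.0, 0.0]
    tr = [0.0, 2.0, 0.0]
    ring = [[0, 1], [1, 2], [2, 0]]
    print(consensus_step(q, tr, ring))
```

In this toy run, agent 0 adopts agent 1's estimate because agent 1 is its only experienced neighbor, while agent 2, which hears only from inexperienced agents, keeps its own value. An average consensus scheme would instead weight all neighbors equally regardless of experience.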


Updated: 2023-10-24
