Collision avoidance control for limited perception unmanned surface vehicle swarm based on proximal policy optimization
Journal of the Franklin Institute (IF 4.1), Pub Date: 2024-03-10, DOI: 10.1016/j.jfranklin.2024.106709
Mengmeng Yin, Yiyun Zhao, Fanbiao Li, Bin Liu, Chunhua Yang, Weihua Gui

Collision avoidance control (CAC) is the primary problem in ensuring the safe and coordinated operation of an unmanned surface vehicle (USV) swarm in complex marine environments. However, limited perception, environmental uncertainty, and multi-source complexity pose significant challenges to the efficient collaboration and CAC of the USV swarm. To overcome these challenges, this paper proposes a distributed CAC method for USVs based on proximal policy optimization (PPO). The method does not require a precise system model and learns autonomously, so it can adapt to unknown environments. For CAC, rather than designing the reward function solely from the distance to obstacles, we also consider the velocity of obstacles and incorporate optimal reciprocal collision avoidance (ORCA) into the reward design. We further account for the limited perception range of USVs and construct a bidirectional gated recurrent unit (BiGRU) network to extract features from variable-length observation data, which addresses the dimensionality problem in the observed data. Moreover, we build a high-quality USV swarm simulation environment with the Gazebo 3D physics engine and use it to test the generalization capability of the collision avoidance policy. Finally, to verify the effectiveness of policy learning and optimization, a series of experiments is conducted across various scenarios, network architectures, and control methods. The experimental results indicate that our approach is markedly superior in terms of travel time, average velocity, average reward, and success rate.
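
For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It is not taken from the paper: the function name `ppo_clip_objective`, the clipping value `eps=0.2`, and the sample numbers are assumptions chosen for illustration, and the paper's actual network, reward, and hyperparameters are not given in the abstract.

```python
def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    eps:        clipping range (0.2 is a common default, assumed here)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("batch must not be empty")

    total = 0.0
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(r, 1.0 + eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(r * a, clipped * a)
    return total / len(ratios)


if __name__ == "__main__":
    # Hypothetical batch of three samples.
    print(ppo_clip_objective([1.5, 0.5, 1.0], [1.0, -1.0, 2.0]))
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the new policy far from the old one in a single update, which is the property that makes PPO stable enough for model-free settings such as the one described above.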

Updated: 2024-03-10