Inspection Robot Navigation Based on Improved TD3 Algorithm
Sensors ( IF 3.9 ) Pub Date : 2024-04-15 , DOI: 10.3390/s24082525
Bo Huang, Jiacheng Xie, Jiawei Yan
The swift advancement of robotics has made navigation an essential task for mobile robots. Map-based navigation methods depend on a global environmental map for decision-making, so their efficacy falls short in unfamiliar or dynamic settings. Current deep reinforcement learning navigation strategies can navigate successfully without pre-existing map data, yet they grapple with inefficient training, slow convergence, and sparse rewards. To tackle these challenges, this study introduces an improved twin delayed deep deterministic policy gradient algorithm (LP-TD3) for local planning navigation. First, a long short-term memory (LSTM) module and a Prioritized Experience Replay (PER) mechanism are integrated into the existing TD3 framework to optimize training and improve the efficiency of experience data utilization. Furthermore, an Intrinsic Curiosity Module (ICM) merges intrinsic with extrinsic rewards to tackle the sparse-reward problem and enhance exploratory behavior. Experimental evaluations using the ROS and Gazebo simulators demonstrate that the proposed method outperforms the original algorithm on various performance metrics.
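The ICM idea mentioned in the abstract can be illustrated with a minimal sketch: a forward model predicts the next state, and its prediction error is scaled into a curiosity bonus that is added to the environment's extrinsic reward before the transition is stored for TD3 training. This is a generic illustration of the intrinsic/extrinsic reward fusion, not the paper's implementation; the function names and the scaling factor `eta` are assumptions.

```python
def intrinsic_reward(pred_next_state, actual_next_state, eta=0.1):
    """ICM-style curiosity bonus: scaled squared error of the forward
    model's prediction of the next state (states as feature vectors)."""
    err = sum((p - a) ** 2 for p, a in zip(pred_next_state, actual_next_state))
    return (eta / 2.0) * err


def combined_reward(r_ext, pred_next_state, actual_next_state, eta=0.1):
    """Total reward fed to the critic: extrinsic reward from the
    environment plus the intrinsic curiosity bonus. Poorly predicted
    (novel) states yield a larger bonus, encouraging exploration."""
    return r_ext + intrinsic_reward(pred_next_state, actual_next_state, eta)
```

When the forward model predicts the next state perfectly, the bonus vanishes and the agent falls back on the extrinsic reward alone; in novel regions the prediction error, and hence the exploration incentive, grows.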

Updated: 2024-04-16