Abstract
Advances in artificial intelligence (AI) continue to improve the autonomous navigation and behavior decision-making capabilities of marine autonomous surface ships (MASS), and with them the safety of navigation. However, recent algorithms perform poorly in unknown and complex environments and cannot effectively handle encounters that arise from the uncertain behavior of other ships. This study therefore proposes an intelligent navigation method that combines the Probabilistic Roadmap (PRM) and Proximal Policy Optimization (PPO) algorithms to support autonomous navigation and collision-avoidance decision-making for MASS. The reward function design takes into account the behaviors that COLREGs prescribe for MASS. In extreme encounter situations, where it becomes necessary for the MASS to depart from COLREGs, the reward function is defined accordingly. Finally, the autonomous navigation and decision-making capability of the MASS is evaluated using real-time ship traffic in a voyage scenario, and various extreme encounter situations are simulated to demonstrate the generality and practicality of the proposed PRM-PPO method.
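For readers unfamiliar with PPO, the following minimal sketch shows the clipped surrogate objective from the original PPO formulation, which limits how far a single policy update can move away from the previous policy. It is an illustration only, not the authors' implementation: the function name `ppo_clip_objective`, the list-based inputs, and the clip range `eps=0.2` are assumptions, and the reward terms related to COLREGs described in the abstract are not modeled here.

```python
from statistics import fmean


def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    ratios[i]     : pi_new(a_i | s_i) / pi_old(a_i | s_i)
    advantages[i] : advantage estimate for sample i
    eps           : clip range (0.2 is an assumed, commonly used value)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    terms = []
    for r, a in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped_r = max(1.0 - eps, min(r, 1.0 + eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(r * a, clipped_r * a))
    return fmean(terms)
```

With a probability ratio of 1.5 and an advantage of 1.0, the sample's contribution is capped at 1.2, so the policy gains nothing from pushing the ratio further above 1 + eps. With a negative advantage the unclipped term is kept whenever it is the smaller of the two, so harmful updates are still fully penalized.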
Data availability
Data will be made available on request.
Funding
This work was funded by the National Natural Science Foundation of China (Grant number 52171342, awarded to Wei Guan) and by the DMU navigation college first-class interdisciplinary research project (Grant number 2023JXA03, awarded to Wei Guan).
About this article
Cite this article
Guan, W., Han, H. & Cui, Z. Autonomous navigation of marine surface vessel in extreme encounter situation. J Mar Sci Technol 29, 167–180 (2024). https://doi.org/10.1007/s00773-023-00979-w
DOI: https://doi.org/10.1007/s00773-023-00979-w