
A path planning method based on deep reinforcement learning for crowd evacuation

  • Original Research
  • Published in: Journal of Ambient Intelligence and Humanized Computing

Abstract

Deep reinforcement learning (DRL) is well suited to complex path-planning problems because of its ability to make continuous decisions in complex environments. However, as the population size grows in crowd evacuation path planning, the computational burden increases substantially, which limits the efficiency of current DRL algorithms. This paper presents a DRL-based path planning method for crowd evacuation that addresses this problem. First, we divide the crowd into groups based on the relationships and distances between individuals and select a leader for each group. Next, we extend the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm into an Optimized Multi-Agent Deep Deterministic Policy Gradient (OMADDPG) algorithm that computes the global evacuation path. OMADDPG uses the Cross-Entropy Method (CEM) to optimize the policy and improves the neural network's training efficiency with a Data Pruning (DP) algorithm. In addition, the social force model is improved by incorporating interpersonal relationships and psychological factors. Finally, the improved social force model and the OMADDPG algorithm are combined: OMADDPG transmits path information to the leaders, and pedestrians, driven by the improved social force model, follow the leaders to complete the evacuation simulation. The method uses leaders to guide pedestrians safely to the exits and reduces evacuation time in different environments. Simulation results demonstrate the efficiency of the proposed path planning method.
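The pipeline described above ends with pedestrians following their group leader under an improved social force model. The paper's exact force terms are not reproduced here, so the following is a minimal Python sketch under assumed, illustrative values: the names TAU, A, B, V_DESIRED, relation_w and the function follower_step are hypothetical. A driving force relaxes each follower's velocity toward its leader, an exponential repulsion keeps pedestrians apart, and a single relationship weight stands in for the interpersonal and psychological factors the abstract mentions.

    import numpy as np

    # Minimal sketch of a leader-follower step in a social-force-style model.
    # All names and constants below are illustrative assumptions, not the
    # paper's actual parameters.
    TAU = 0.5          # relaxation time (s)
    A, B = 2.1, 0.3    # repulsion strength (m/s^2) and range (m)
    V_DESIRED = 1.34   # desired walking speed (m/s)

    def follower_step(pos, vel, leader_pos, others_pos, relation_w=1.0, dt=0.1):
        """Advance one pedestrian following its group leader by one Euler step."""
        # Driving force: relax toward the desired velocity pointing at the leader.
        direction = leader_pos - pos
        direction /= np.linalg.norm(direction) + 1e-9
        f_drive = (V_DESIRED * direction - vel) / TAU

        # Repulsion from other pedestrians (exponential decay with distance).
        f_rep = np.zeros(2)
        for other in others_pos:
            diff = pos - other
            dist = np.linalg.norm(diff) + 1e-9
            f_rep += A * np.exp(-dist / B) * (diff / dist)

        # Relationship/psychological weight scales how strongly the follower
        # sticks to its leader.
        acc = relation_w * f_drive + f_rep
        vel = vel + acc * dt
        pos = pos + vel * dt
        return pos, vel

    # Example usage with made-up positions.
    p, v = follower_step(np.array([0.0, 0.0]), np.array([0.0, 0.0]),
                         leader_pos=np.array([5.0, 2.0]),
                         others_pos=[np.array([0.5, 0.1])])

In the full method, the leaders' target positions would be supplied by the OMADDPG policy rather than fixed, so a step like this would run once per pedestrian per simulation tick.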


Code or data availability

The data and code used and analysed during the study are available from the first author on reasonable request.


Funding

This work was supported by the National Natural Science Foundation of China (62276156, 61876102, 61972237).

Author information

Corresponding authors

Correspondence to Hong Liu or Wenhao Li.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Consent to participate

All authors have read and agreed to publish this work.

Consent for publication

All authors have read and agreed to publish this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Meng, X., Liu, H. & Li, W. A path planning method based on deep reinforcement learning for crowd evacuation. J Ambient Intell Human Comput (2024). https://doi.org/10.1007/s12652-024-04787-x


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s12652-024-04787-x

Keywords

Navigation