
Deep Reinforcement Learning with Inverse Jacobian based Model-Free Path Planning for Deburring in Complex Industrial Environment

  • Regular paper
  • Published in: Journal of Intelligent & Robotic Systems

Abstract

In this study, we present an approach to robotic deburring path planning that combines deep reinforcement learning (DRL) with an inverse Jacobian strategy. Existing model-based path planning methods, including sampling-based approaches, often suffer from high computational complexity and struggle to capture the dynamics of deburring systems. To overcome these limitations, our DRL-based framework leverages experiential learning to identify optimal deburring trajectories without relying on predefined models. This model-free approach is particularly suited to complex deburring scenarios with unknown system dynamics. Additionally, we employ an inverse Jacobian technique with a time-varying gain module (η(t) = e^{2t}) during training, which improves the exploration–exploitation balance and collision avoidance, enhancing the overall performance of the DRL agent. Through a series of experiments in a simulated environment, we evaluate the efficacy of the proposed algorithm for deburring path planning. Our modified DRL-based approach, which uses inverse kinematics with a time-varying gain module, demonstrates superior convergence speed, optimality, and robustness compared with conventional path planning methods. In particular, it outperforms sampling-based strategies, achieving an average success rate of 97%. The inverse Jacobian technique further improves the algorithm by reducing the state-space dimensionality, which increases learning efficiency and yields optimal deburring trajectories.
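
To make the inverse Jacobian module concrete, below is a minimal sketch of one such update step. This is not the authors' implementation: the planar two-link arm, link lengths, step size dt, and helper names fk and jacobian are illustrative assumptions; only the gain schedule η(t) = e^{2t} comes from the abstract.

    import numpy as np

    L1, L2 = 0.4, 0.3  # hypothetical link lengths (m) for a planar 2R arm

    def fk(q):
        # Forward kinematics: joint angles -> end-effector (x, y).
        return np.array([
            L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1]),
            L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1]),
        ])

    def jacobian(q):
        # Analytic Jacobian of the planar 2R forward kinematics.
        s1, c1 = np.sin(q[0]), np.cos(q[0])
        s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
        return np.array([
            [-L1 * s1 - L2 * s12, -L2 * s12],
            [ L1 * c1 + L2 * c12,  L2 * c12],
        ])

    def eta(t):
        # Time-varying gain eta(t) = e^(2t), as stated in the abstract.
        return np.exp(2.0 * t)

    def inverse_jacobian_step(q, x_target, t, dt=0.01):
        # One inverse-Jacobian correction of the joints toward the target.
        error = x_target - fk(q)               # task-space error
        J_pinv = np.linalg.pinv(jacobian(q))   # Moore-Penrose pseudoinverse
        return q + dt * eta(t) * (J_pinv @ error)

    # Example: drive the arm toward a reachable Cartesian target.
    q = np.array([0.3, 0.5])
    target = np.array([0.5, 0.2])
    for k in range(200):
        q = inverse_jacobian_step(q, target, t=0.01 * k)

Under this reading, the exponentially increasing gain keeps early corrections small and later corrections aggressive, which is one plausible mechanism for the exploration–exploitation balance the abstract attributes to the module.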


Data Availability

The data are available from the authors and can be shared upon reasonable request.


Acknowledgements

The authors thank Visvesvaraya National Institute of Technology, Nagpur, India, for providing the infrastructure and computational facilities needed to conduct this research. No funding was received for this study.

Author information

Authors and Affiliations

Authors

Contributions

Rahul M R (Research Scholar) collected the data, implemented the proposed methodology, analyzed the results, and drafted the manuscript.

Dr. Shital S. Chiddarwar (Supervisor) conceptualized and designed the methodology for the research work, helped interpret the results, and reviewed and edited the manuscript to its final form.

Corresponding author

Correspondence to Shital S. Chiddarwar.

Ethics declarations

Ethical and Informed Consent for Data Used

Not Applicable.

Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Rahul, M.R., Chiddarwar, S.S. Deep Reinforcement Learning with Inverse Jacobian based Model-Free Path Planning for Deburring in Complex Industrial Environment. J Intell Robot Syst 110, 4 (2024). https://doi.org/10.1007/s10846-023-02030-x

