
Using a half cheetah habitat for random augmentation computing

Multimedia Tools and Applications

Abstract

Reinforcement learning algorithms that rely on physical models are generally believed to deliver stronger results than model-free approaches when applied to dynamical systems. Previous research has largely addressed such control problems with sophisticated neural network models, which require large amounts of training data to reach state-of-the-art performance. Within model-free policy search, Augmented Random Search (ARS) is often regarded as a way to avoid the poorly performing policies that model-free reinforcement learning can otherwise produce. ARS improves the effectiveness and speeds up the training of linear policies on control tasks in the Half-Cheetah environment of a virtual physics engine. To this end, we exploit the computational efficiency of ARS to evaluate the agent's performance across several random episodes and hyperparameter settings. The study also demonstrates the effectiveness of the chosen search approach through an examination of simulation data covering many episodes and agent behaviors. Our simulations reveal that the metric commonly used to assess the efficiency of RL learning is inadequate for accurately judging performance in specific circumstances; in these instances, ARS surpasses other algorithms by achieving higher rewards.
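The paper's own training loop is not reproduced here, but for readers who want to see the mechanics the abstract describes, the following is a minimal sketch of the basic ARS update (the V1 variant) with a linear policy on HalfCheetah. The environment id, hyperparameter values, and helper names are illustrative assumptions, not the settings used in the study; it assumes gymnasium with the MuJoCo binding installed.

```python
# Minimal sketch of ARS (V1) with a linear policy on HalfCheetah.
# Hyperparameters are illustrative, not the paper's settings.
import numpy as np
import gymnasium as gym  # assumes gymnasium[mujoco] is installed

def rollout(env, M, horizon=1000):
    """Run one episode with the deterministic linear policy a = M @ s."""
    s, _ = env.reset()
    total = 0.0
    for _ in range(horizon):
        a = np.clip(M @ s, env.action_space.low, env.action_space.high)
        s, r, terminated, truncated, _ = env.step(a)
        total += r
        if terminated or truncated:
            break
    return total

def ars(env_id="HalfCheetah-v4", iters=100, n_dirs=8, top_b=4,
        step=0.02, noise=0.03, seed=0):
    rng = np.random.default_rng(seed)
    env = gym.make(env_id)
    p = env.action_space.shape[0]       # action dimension
    n = env.observation_space.shape[0]  # observation dimension
    M = np.zeros((p, n))                # linear policy weights
    for it in range(iters):
        # Sample random perturbation directions and roll out +/- versions.
        deltas = rng.standard_normal((n_dirs, p, n))
        r_pos = np.array([rollout(env, M + noise * d) for d in deltas])
        r_neg = np.array([rollout(env, M - noise * d) for d in deltas])
        # Keep only the top_b directions ranked by max(r+, r-).
        order = np.argsort(np.maximum(r_pos, r_neg))[::-1][:top_b]
        sigma = np.concatenate([r_pos[order], r_neg[order]]).std() + 1e-8
        # Step along the surviving directions, scaled by reward std.
        M += step / (top_b * sigma) * sum(
            (r_pos[k] - r_neg[k]) * deltas[k] for k in order)
        print(f"iter {it:3d}  mean reward {0.5 * (r_pos + r_neg).mean():8.1f}")
    return M
```

Each iteration perturbs the weight matrix in random directions, rolls out the positive and negative perturbations, and moves along the top-performing directions scaled by the standard deviation of their rewards; this reward normalization and direction filtering is what distinguishes ARS from plain random search.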


Data availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.


Author information


Corresponding author

Correspondence to Kaushal Kishor.

Ethics declarations

Conflicts of interest/Competing interests

I hereby affirm that this work is entirely original and has not been previously published, nor is it currently under review for publication elsewhere.

I do not have any conflicts of interest to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kishor, K. Using a half cheetah habitat for random augmentation computing. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19084-0

