Abstract
Model-based reinforcement learning algorithms are generally believed to outperform model-free approaches when applied to dynamical systems. Prior work has mostly tackled continuous-control problems with sophisticated neural-network policies, which require large amounts of training data to reach state-of-the-art results. This paper shows that Augmented Random Search (ARS), a model-free policy search method, avoids the poor outcomes often attributed to model-free reinforcement learning: it trains simple linear policies efficiently and quickly on control tasks in the Half-Cheetah environment of a virtual physics engine. We exploit the computational efficiency of ARS to evaluate agent performance across many random episodes and hyperparameter settings. The study also demonstrates the effectiveness of the search approach through analysis of the simulation data, covering numerous episodes and agent behaviors. Our simulations reveal that the metric commonly used to assess the efficiency of RL learning is inadequate for accurately judging performance in specific circumstances; in those cases, ARS surpasses other algorithms by achieving higher rewards.
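To make the method concrete, the basic ARS update described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the perturbation count, step size, and exploration noise are placeholder values, and a toy reward function stands in for a Half-Cheetah rollout so the snippet is self-contained.

```python
import numpy as np

def ars_step(theta, rollout_reward, n_dirs=8, step_size=0.02, noise=0.03, rng=None):
    """One ARS update: perturb the linear policy weights in random
    directions, evaluate paired (+/-) rollouts, and move along the
    reward-weighted average of the sampled directions."""
    rng = rng or np.random.default_rng()
    deltas = [rng.standard_normal(theta.shape) for _ in range(n_dirs)]
    r_plus = np.array([rollout_reward(theta + noise * d) for d in deltas])
    r_minus = np.array([rollout_reward(theta - noise * d) for d in deltas])
    # Scale the step by the standard deviation of the collected rewards.
    sigma_r = np.concatenate([r_plus, r_minus]).std() + 1e-8
    update = sum((rp - rm) * d for rp, rm, d in zip(r_plus, r_minus, deltas))
    return theta + step_size / (n_dirs * sigma_r) * update

# Toy stand-in for an environment rollout: reward is higher when the
# policy weight matrix is close to a hidden target (hypothetical task).
target = np.ones((2, 3))
reward = lambda th: -np.sum((th - target) ** 2)

theta = np.zeros((2, 3))
for _ in range(300):
    theta = ars_step(theta, reward)
```

In the actual experiments, `rollout_reward` would run the linear policy for one episode in the Half-Cheetah physics simulation and return the accumulated reward; because each rollout is independent, the paired evaluations parallelize trivially, which is the source of ARS's computational efficiency.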
Data availability
Data sharing is not relevant to this paper as no datasets were produced or examined during the present investigation.
Ethics declarations
Conflicts of interests/Competing interests
I hereby affirm that this work is entirely original and has not been previously published, nor is it currently under review for publication elsewhere.
I do not have any conflicts of interest to declare.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kishor, K. Using a half cheetah habitat for random augmentation computing. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19084-0