Deep hierarchical reinforcement learning for collaborative object transportation by heterogeneous agents
Computers & Electrical Engineering (IF 4.3), Pub Date: 2024-01-23, DOI: 10.1016/j.compeleceng.2023.109066
Maram Hasan , Rajdeep Niyogi

In the logistics and supply chain domain, coordinated effort among agents plays a pivotal role, particularly in collaborative object transportation within a warehouse. This paper addresses the multifaceted challenge of multi-agent coordination in warehouse environments with sparse reward structures, where communication among agents may be limited or infeasible. Due to constraints such as power limitations, weight capacity, or specialized abilities, a single agent cannot execute this task alone. Our study focuses on heterogeneous agents, each possessing a distinct subset of skills and capabilities. We examine the emergence of cooperative behavior among groups of agents with the requisite skill sets, aiming to accomplish the task without explicit inter-agent communication or prior coordination. To encourage implicit coordination, we introduce a hierarchical approach that integrates a global evaluation of abstract actions with curiosity-driven intrinsic learning, making it well suited to real-world settings with scarce rewards. We evaluated its effectiveness in a warehouse domain, and the results show that our approach consistently achieves higher average returns, faster convergence, and improved exploration efficiency across diverse scenarios.
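The abstract does not give implementation details, but the curiosity-driven intrinsic learning it mentions is commonly realized by rewarding an agent for transitions its learned dynamics model predicts poorly. The sketch below is a minimal, hypothetical illustration of that idea (it is not the authors' architecture): a toy linear forward model whose prediction error serves as an intrinsic reward, added to the sparse extrinsic reward. All names, dimensions, and the `beta` weighting are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, ACTION_DIM = 4, 2  # toy sizes, chosen only for illustration


class ForwardModel:
    """Toy linear model predicting next_state from (state, action).

    Its prediction error acts as a curiosity signal: novel transitions
    are predicted badly (high intrinsic reward); repeatedly visited
    transitions become predictable (low intrinsic reward).
    """

    def __init__(self, lr=0.05):
        self.W = rng.normal(scale=0.1, size=(STATE_DIM + ACTION_DIM, STATE_DIM))
        self.lr = lr

    def update(self, state, action, next_state):
        # One SGD step on squared prediction error; returns the error
        # *before* the update, used as the intrinsic reward.
        x = np.concatenate([state, action])
        err = x @ self.W - next_state
        self.W -= self.lr * np.outer(x, err)
        return float(np.mean(err ** 2))


def shaped_reward(extrinsic, intrinsic, beta=0.1):
    # beta trades off curiosity against the sparse task reward.
    return extrinsic + beta * intrinsic


# Revisiting the same (state, action, next_state) transition makes it
# predictable, so the curiosity bonus decays over repeated visits.
model = ForwardModel()
s = rng.normal(size=STATE_DIM)
a = np.array([1.0, 0.0])        # one-hot abstract action
s_next = s + 0.1                # toy deterministic transition
first_bonus = model.update(s, a, s_next)
for _ in range(200):
    last_bonus = model.update(s, a, s_next)
# first_bonus > last_bonus: familiar transitions stop being rewarded
```

In a sparse-reward warehouse task, this kind of bonus keeps agents exploring between the rare extrinsic payoffs; the hierarchical element in the paper additionally evaluates abstract actions globally, which this single-model sketch does not attempt to capture.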



Updated: 2024-01-25