Deep reinforcement learning approach with hybrid action space for mobile charging in wireless rechargeable sensor networks
Expert Systems with Applications (IF 8.5), Pub Date: 2024-03-20, DOI: 10.1016/j.eswa.2024.123752
Chengpeng Jiang , Wencong Chen , Xingcan Chen , Sen Zhang , Wendong Xiao

Mobile charging is a feasible way to deal with the energy-constrained problem in wireless rechargeable sensor networks (WRSNs). Mobile chargers (MCs) are usually employed to charge the sensors sequentially according to a charging scheme. Existing studies assume that each sensor must be charged to its maximum energy capacity, or to a fixed upper threshold, before the next one can be charged; however, they neglect to adaptively control the charging time of each sensor according to its charging demand. Hence, in this paper, we assume that the charging time of each sensor can be controlled, and we study the joint optimization of the charging sequence and charging time (JCSCT) problem. Correspondingly, we propose a novel deep reinforcement learning approach with hybrid action space for JCSCT (DRLH-JCSCT), which utilizes a deep Q-network (DQN) to generate the charging sequence and adopts deep deterministic policy gradient (DDPG) to determine the charging time. An attention-based encoder–decoder model is integrated into the actor network of DDPG, and a modified bi-directional gated recurrent unit network (MBGRU) is utilized as the decoder. We also design a novel reward function to evaluate the quality of the charging actions. Simulations demonstrate the improved charging performance of the proposed approach, with a longer network lifetime and fewer failed sensors compared with existing mobile charging scheduling approaches.
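As a rough illustration of the hybrid action space described in the abstract, the sketch below pairs a DQN-style head (discrete choice of which sensor to charge next) with a DDPG-style actor head (continuous charging time for that sensor). This is a minimal PyTorch sketch under assumed state layout, network sizes, and charging-time bound; it omits the attention-based encoder–decoder and MBGRU components, and every name here (HybridChargingPolicy, max_charge_time, etc.) is hypothetical rather than the authors' implementation.

# Minimal sketch of a hybrid discrete/continuous charging policy (assumed design).
import torch
import torch.nn as nn


class HybridChargingPolicy(nn.Module):
    def __init__(self, state_dim: int, num_sensors: int, max_charge_time: float = 60.0):
        super().__init__()
        self.max_charge_time = max_charge_time  # assumed upper bound on charging time
        # Shared encoder over the MC/network state (hypothetical state layout).
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
        )
        # DQN-style head: one Q-value per candidate sensor -> discrete charging sequence.
        self.q_head = nn.Linear(128, num_sensors)
        # DDPG-style actor head: continuous charging time, conditioned on the chosen sensor.
        self.time_head = nn.Linear(128 + num_sensors, 1)

    def forward(self, state: torch.Tensor):
        h = self.encoder(state)
        q_values = self.q_head(h)                       # (batch, num_sensors)
        sensor = q_values.argmax(dim=-1)                # greedy discrete action
        one_hot = nn.functional.one_hot(sensor, q_values.size(-1)).float()
        # Squash to (0, 1) and scale to the assumed maximum charging time.
        t = torch.sigmoid(self.time_head(torch.cat([h, one_hot], dim=-1)))
        charge_time = t.squeeze(-1) * self.max_charge_time
        return sensor, charge_time, q_values


if __name__ == "__main__":
    policy = HybridChargingPolicy(state_dim=32, num_sensors=10)
    state = torch.randn(4, 32)                          # batch of 4 synthetic states
    sensor, charge_time, _ = policy(state)
    print(sensor.tolist(), charge_time.tolist())

Conditioning the continuous head on the discrete choice (here via a one-hot of the selected sensor) is one common way to couple the two action components; the paper's actual coupling, encoder–decoder attention, and MBGRU decoder are not reproduced here.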

Updated: 2024-03-20