Data-Efficient Deep Reinforcement Learning-Based Optimal Generation Control in DC Microgrids
IEEE Systems Journal ( IF 4.4 ) Pub Date : 2024-02-05 , DOI: 10.1109/jsyst.2024.3355328
Zhen Fan, Wei Zhang, Wenxin Liu

Because of their simplicity and high energy-utilization efficiency, DC microgrids are gaining popularity as an attractive option for the optimal operation of numerous distributed energy resources. The nonlinearity and nonconvexity of the optimal power flow problem make conventional control approaches difficult to develop and apply directly. With the development of machine learning in recent years, deep reinforcement learning (DRL) has emerged as a way to solve such complex optimal control problems. This article proposes a DRL-based twin delayed deep deterministic policy gradient (TD3) optimal control scheme to achieve optimal generation control for DC microgrids. The generation cost of the distributed generators is minimized, while the key constraints, such as generation bounds and bus-voltage bounds, are both guaranteed. The proposed approach connects the optimal control and reinforcement learning frameworks through a centralized-training, distributed-execution structure. Case studies show that reinforcement learning algorithms can optimize nonlinear and nonconvex systems with fast dynamics by using particular reward-function designs, data-sampling, and constraint-management strategies. In addition, populating the experience replay buffer before training drastically lowers the learning failure rate, enhancing the data efficiency of the DRL process.
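Two of the abstract's ideas can be illustrated concretely: a penalty-shaped reward that discourages violating the generation and bus-voltage bounds, and an experience replay buffer populated with random-policy transitions before any training update. The sketch below is a toy illustration only, not the paper's actual method; all bounds, cost coefficients, penalty weights, and the voltage model are illustrative assumptions.

```python
import random

# Illustrative per-unit bounds (assumed values, not from the paper).
P_MIN, P_MAX = 0.0, 1.0      # generation bounds
V_MIN, V_MAX = 0.95, 1.05    # bus-voltage bounds

def reward(p_gen, v_bus, a=1.0, b=0.1, penalty_weight=10.0):
    """Negative quadratic generation cost, minus a fixed penalty for
    each violated constraint. Coefficients are hypothetical."""
    cost = a * p_gen ** 2 + b * p_gen
    penalty = 0.0
    if not (P_MIN <= p_gen <= P_MAX):
        penalty += penalty_weight
    if not (V_MIN <= v_bus <= V_MAX):
        penalty += penalty_weight
    return -(cost + penalty)

def prefill_buffer(n, seed=0):
    """Collect n random-policy transitions before the first gradient
    step, so training never begins from a near-empty buffer."""
    rng = random.Random(seed)
    buf = []
    for _ in range(n):
        p = rng.uniform(-0.2, 1.2)   # random action; may violate bounds
        v = rng.uniform(0.9, 1.1)    # toy stand-in for the bus voltage
        buf.append((p, v, reward(p, v)))
    return buf

buffer = prefill_buffer(1000)
```

In a full TD3 setup the pre-filled transitions would simply seed the replay buffer from which minibatches are sampled; the penalty terms steer the learned policy toward the feasible region without requiring a hard projection step.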

Updated: 2024-02-05