当前位置: X-MOL 学术Isa Trans. › 论文详情
Our official English website, www.x-mol.net, welcomes your feedback! (Note: you will need to create a separate account there.)
Reinforcement learning-based attitude control for a barbell electric sail
ISA Transactions ( IF 7.3 ) Pub Date : 2024-02-24 , DOI: 10.1016/j.isatra.2024.02.019
Xiaolei Ma , Hao Wen

The electric solar wind sail (E-sail) is a new propellant-free propulsion concept. The under-actuated and highly nonlinear features of E-sail systems pose a great challenge to their attitude controller design. Conventional control schemes may not be capable of dealing with this tough problem. To this end, a reinforcement learning (RL)-based control scheme, which can explore and obtain optimal policies in the absence of training datasets, is proposed for the attitude control of a barbell E-sail system. The barbell E-sail comprises two end satellites linked to an insulated confluence point through long and conductive tethers. The voltages of the two tethers can be individually modulated for attitude control. The system attitude dynamics is described using a nonsingular formulation. The control scheme has a two-stage design. In the first stage, an RL controller based on the Proximal Policy Optimization (PPO) algorithm is used to obtain an RL control strategy, which is emulated and updated by neural networks. In the second stage, the attitude feedback control is accomplished with low computation and energy consumption and fast convergence speed by performing a real-time mapping from the system state to the control output using the updated control strategy. Finally, the simulation results demonstrate that the proposed RL-based control scheme can effectively adjust the E-sail to the design attitude by regulating the tether voltage difference. The comparisons with the NMPC scheme also indicate that the developed control scheme can significantly reduce the computation time with control accuracy maintained.

中文翻译:

基于强化学习的杠铃电帆姿态控制

电动太阳能风帆(E-sail)是一种新的无推进剂推进概念。电动帆系统的欠驱动和高度非线性的特点对其姿态控制器的设计提出了巨大的挑战。传统的控制方案可能无法解决这个难题。为此,提出了一种基于强化学习(RL)的控制方案,该方案可以在缺乏训练数据集的情况下探索并获得最优策略,用于杠铃电子帆系统的姿态控制。杠铃电子帆由两个末端卫星组成,通过长而导电的系绳连接到绝缘汇合点。两条系绳的电压可以单独调制以进行姿态控制。系统姿态动力学使用非奇异公式来描述。控制方案采用两级设计。在第一阶段,使用基于近端策略优化(PPO)算法的RL控制器来获得RL控制策略,并通过神经网络进行仿真和更新。在第二阶段,通过使用更新的控制策略执行从系统状态到控制输出的实时映射,以低计算量和能耗以及快速收敛速度完成姿态反馈控制。最后,仿真结果表明,所提出的基于强化学习的控制方案可以通过调节系绳电压差来有效地将电子帆调整到设计姿态。与NMPC方案的比较还表明,所开发的控制方案可以在保持控制精度的情况下显着减少计算时间。
更新日期:2024-02-24
down
wechat
bug