Acta Armamentarii ›› 2025, Vol. 46 ›› Issue (2): 240222. doi: 10.12382/bgxb.2024.0222

Adaptive Terminal Guidance for Hypersonic Gliding Vehicles Using Reinforcement Learning

XIAO Liujun, LI Yaxuan, LIU Xinfu*

  1. School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2024-03-28 Online: 2025-02-28

Abstract:

To address the uncertainty in dynamic model parameters during the terminal guidance phase of hypersonic gliding vehicles, as well as the slow convergence of traditional reinforcement learning algorithms, an adaptive guidance method based on reinforcement learning is proposed. The terminal guidance problem for a hypersonic gliding vehicle under nominal conditions is first formulated as an optimal control problem and solved with a sequential convex optimization algorithm to generate a dataset of state-control pairs. A neural network guidance model is then obtained by fitting this dataset through supervised learning. Finally, disturbances such as aerodynamic parameter deviations, uncertainty in the control response delay coefficient, and state measurement noise are introduced, and the neural network guidance model is further optimized by reinforcement learning through a large number of interactions between the vehicle and the disturbed environment. Numerical simulation results indicate that the proposed guidance method is more robust and more accurate than the guidance method based on supervised learning alone.

Key words: hypersonic gliding vehicle, supervised learning, reinforcement learning, adaptive terminal guidance
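
The two-stage workflow described in the abstract (supervised pretraining on nominal data, then refinement under disturbances) can be illustrated with a minimal toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 1-D point-mass dynamics, the hand-picked linear feedback law `EXPERT_GAIN` that stands in for sequential convex optimization solutions, the linear policy that stands in for the neural network, and a simple random-search hill climb that stands in for the reinforcement learning stage.

```python
import numpy as np

# Hypothetical nominal feedback law; a stand-in for optimal-control (SCP) solutions.
EXPERT_GAIN = np.array([1.0, 1.8])

def rollout(gain, drag_scale=1.0, lag=0.0, noise_std=0.0, seed=0, steps=100, dt=0.1):
    """Simulate a toy 1-D point mass steered to the origin; return the terminal miss cost."""
    rng = np.random.default_rng(seed)
    pos, vel, u_prev = 10.0, 0.0, 0.0
    for _ in range(steps):
        # Noisy state measurement
        obs = np.array([pos, vel]) + noise_std * rng.standard_normal(2)
        u_cmd = float(np.clip(-(gain @ obs), -5.0, 5.0))
        # First-order control response lag
        u = lag * u_prev + (1.0 - lag) * u_cmd
        # Quadratic drag with a scale factor playing the role of an aerodynamic deviation
        vel += (u - 0.1 * drag_scale * vel * abs(vel)) * dt
        pos += vel * dt
        u_prev = u
    return pos ** 2 + vel ** 2

# Stage 1: supervised learning on nominal state-control pairs (least-squares fit).
rng = np.random.default_rng(0)
states = rng.uniform(-10.0, 10.0, size=(500, 2))
controls = -(states @ EXPERT_GAIN) + 0.01 * rng.standard_normal(500)
weights, *_ = np.linalg.lstsq(states, controls, rcond=None)
gain_sl = -weights

# Stage 2: refine the pretrained policy on disturbed scenarios.
# Each tuple is (drag_scale, lag, noise_std, seed); the values are illustrative.
SCENARIOS = [(0.7, 0.2, 0.05, 1), (1.3, 0.4, 0.05, 2), (1.0, 0.6, 0.1, 3), (1.5, 0.3, 0.1, 4)]

def mean_cost(gain):
    """Average terminal cost of a policy over the fixed set of disturbed scenarios."""
    return float(np.mean([rollout(gain, d, l, n, s) for d, l, n, s in SCENARIOS]))

cost_sl = mean_cost(gain_sl)
gain_rl, cost_rl = gain_sl.copy(), cost_sl
for _ in range(60):
    candidate = gain_rl + 0.1 * rng.standard_normal(2)  # random perturbation of the policy
    cost = mean_cost(candidate)
    if cost < cost_rl:  # keep the candidate only if it improves the disturbed cost
        gain_rl, cost_rl = candidate, cost

print(f"pretrained cost: {cost_sl:.4f}, refined cost: {cost_rl:.4f}")
```

Starting the second stage from the supervised solution, rather than from a random policy, is the point of the sketch: the search begins near a policy that already works in the nominal case, which is how pretraining can ease slow convergence. The accept-if-better rule only guarantees that the refined cost does not exceed the pretrained cost on these toy scenarios; it says nothing about the paper's reported results.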
