Acta Armamentarii (兵工学报)

A Cooperative Guidance Law for Multi-aircraft Combining Meta-Learning and Reinforcement Learning
(Chinese title: 一种元学习和强化学习结合的多飞行器协同制导律)

WANG Cuncan (王存灿), WANG Xiaofang* (王晓芳), LIN Hai (林海)

  1. School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2024-07-10 Revised: 2024-12-15

Abstract: To address the cooperative guidance problem in which hypersonic re-entry gliding vehicles must hit a target simultaneously at specified angles in a complex environment, a cooperative guidance law based on meta-learning and reinforcement learning is proposed. Taking the disturbances of a complex combat environment into account, the cooperative guidance problem is modeled as a Markov decision process, with the vehicles' motion states as the state space and the proportional navigation coefficient as the action space. The reward function combines the vehicle-target relative distance, the difference in remaining flight time (time-to-go) among the vehicles, and the overload. Based on meta-learning theory, the proximal policy optimization algorithm is combined with gated recurrent units so that the policy learns the common features of similar cooperative guidance tasks. This improves the hit accuracy of the cooperative guidance policy under complex disturbances, satisfies the attack-angle and attack-time constraints, and improves the policy's adaptability to different combat scenarios. Simulation results show that the guidance law enables multiple vehicles to attack the target simultaneously at the specified attack angles in a complex battlefield environment, adapts quickly to new cooperative guidance tasks, and maintains good performance when the cooperative combat scenario changes.

Keywords: hypersonic re-entry gliding vehicle, cooperative guidance, meta-learning, reinforcement learning, proximal policy optimization
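
The abstract names three ingredients of the reward function (relative distance, remaining-flight-time difference, and overload) but does not give its form. The following is a minimal sketch of how such a reward could be assembled for one vehicle. It is not the authors' function: the weights, the normalization scales, the overload limit, the range-over-speed time-to-go estimate, and the names `time_to_go` and `cooperative_reward` are all assumptions made for illustration.

```python
from statistics import mean


def time_to_go(range_m: float, speed_mps: float) -> float:
    """Crude time-to-go estimate: remaining range divided by speed (assumption)."""
    return range_m / max(speed_mps, 1e-6)


def cooperative_reward(ranges_m, speeds_mps, overloads_g, i,
                       w_range=1.0, w_time=1.0, w_load=0.1,
                       range_scale=1.0e5, time_scale=10.0, load_limit_g=20.0):
    """Hypothetical per-step reward for vehicle i (all constants are assumed).

    Penalizes (a) the remaining vehicle-target distance, (b) the gap between
    this vehicle's time-to-go and the group mean, and (c) the squared overload
    relative to an assumed limit.
    """
    t_go = [time_to_go(r, v) for r, v in zip(ranges_m, speeds_mps)]
    time_error = abs(t_go[i] - mean(t_go))

    range_term = ranges_m[i] / range_scale
    time_term = time_error / time_scale
    load_term = (overloads_g[i] / load_limit_g) ** 2

    return -(w_range * range_term + w_time * time_term + w_load * load_term)


if __name__ == "__main__":
    ranges = [80_000.0, 60_000.0, 70_000.0]   # m, illustrative values
    speeds = [2_000.0, 2_000.0, 2_000.0]      # m/s, illustrative values
    loads = [3.0, 3.0, 3.0]                   # g, illustrative values
    for i in range(3):
        print(i, round(cooperative_reward(ranges, speeds, loads, i), 4))
```

In a setup like the one described, the policy's action would be the proportional navigation coefficient, and a reward of this kind would be evaluated at each step of the Markov decision process. The sketch covers only the reward shaping, not the PPO and gated-recurrent-unit training loop.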
