School of Aerospace Science and Technology, Beijing Institute of Technology, Beijing 100081, China
*wangxf@bit.edu.cn
Received: 2024-07-10
Published online: 2025-08-12
Published in print: 2025-07-31
Cuncan WANG, Xiaofang WANG, Hai LIN. A Cooperative Guidance Law Based on Meta-learning and Reinforcement Learning for Multiple Aerial Vehicles[J]. Acta Armamentarii, 2025, 46(7): 240568. DOI: 10.12382/bgxb.2024.0568.
To address the cooperative guidance problem in which multiple hypersonic re-entry glide vehicles must hit a target simultaneously at specified angles in a complex environment, a cooperative guidance law based on meta-learning and reinforcement learning is proposed. Taking the disturbances of a complex combat environment into account, a Markov decision process model of the cooperative guidance problem is established, with the vehicles' motion states as the state space and the proportional navigation coefficient as the action space. The reward function is designed by jointly considering the vehicle-target relative distance, the difference in remaining flight time (time-to-go), and the overload of the vehicles attacking the target. Based on meta-learning theory and reinforcement learning, the proximal policy optimization (PPO) algorithm is combined with gated recurrent units (GRU) to learn the features shared by similar cooperative guidance tasks. This improves the hit accuracy of the cooperative guidance strategy under complex disturbances, satisfies the attack angle and attack time constraints, and improves the adaptability of the strategy to different combat scenarios. Simulation results indicate that the proposed guidance law enables multiple vehicles to attack the target simultaneously at specified attack angles in a complex battlefield environment, adapts quickly to new cooperative guidance tasks, and maintains good performance when the cooperative combat scenario changes.
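The Markov decision model summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' formulation: the abstract only states that the action is the proportional navigation coefficient and that the reward combines relative distance, time-to-go difference, and overload. The state fields, the range-over-closing-speed time-to-go estimate, the use of the group mean as the time reference, and all weights and limits below are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VehicleState:
    """Hypothetical per-vehicle state; field names are assumptions."""
    range_to_target: float  # m, vehicle-target relative distance
    closing_speed: float    # m/s, positive when approaching the target
    los_rate: float         # rad/s, line-of-sight angular rate
    overload: float         # g, current normal load factor


def time_to_go(s: VehicleState) -> float:
    """Crude time-to-go estimate: range divided by closing speed (assumed)."""
    return s.range_to_target / max(s.closing_speed, 1e-6)


def pn_command(nav_coeff: float, s: VehicleState) -> float:
    """Classical proportional navigation: a = N * Vc * LOS rate.

    nav_coeff plays the role of the agent's action (the navigation
    coefficient N); the policy that selects it is not shown here.
    """
    return nav_coeff * s.closing_speed * s.los_rate


def cooperative_reward(
    states: List[VehicleState],
    max_overload: float = 20.0,  # assumed overload limit in g
    w_range: float = 1e-4,       # assumed weight on relative distance
    w_time: float = 0.1,         # assumed weight on time-to-go mismatch
    w_load: float = 0.05,        # assumed weight on overload violation
) -> List[float]:
    """Return one reward per vehicle, penalizing three terms:
    remaining distance, deviation of time-to-go from the group mean,
    and overload beyond the assumed limit.
    """
    tgo = [time_to_go(s) for s in states]
    mean_tgo = sum(tgo) / len(tgo)
    rewards = []
    for s, t in zip(states, tgo):
        excess = max(0.0, abs(s.overload) - max_overload)
        rewards.append(
            -w_range * s.range_to_target
            - w_time * abs(t - mean_tgo)
            - w_load * excess
        )
    return rewards
```

In a setup of this kind, a recurrent policy (for example PPO with a GRU layer, as the abstract describes) would output `nav_coeff` at each step and be trained on the returned rewards; the time-to-go mismatch term is what pushes the vehicles toward simultaneous arrival.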