Acta Armamentarii, 2023, Vol. 44, Issue (11): 3516-3528. doi: 10.12382/bgxb.2022.1276

Special topic: Swarm Cooperation and Autonomous Technology

基于强化学习的海上要地群协同防空动态火力分配

赵文飞1,*, 陈健1, 王2, 滕克难1

  1 海军航空大学, 山东 烟台 264001
  2 91550部队, 辽宁 大连 116041
  • Received: 2022-12-22 Online: 2023-05-04
  • Corresponding author:
  • Funding:
    Naval Aviation University Independent Research Project (H2202201002)

Dynamic Firepower Allocation for Cooperative Air Defense of Strategic Locations on the Sea Based on Reinforcement Learning

ZHAO Wenfei1,*, CHEN Jian1, WANG Yan2, TENG Kenan1

  1 Naval Aviation University, Yantai 264001, Shandong, China
  2 Unit 91550 of PLA, Dalian 116041, Liaoning, China
  • Received: 2022-12-22 Online: 2023-05-04

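Abstract:

To address the dynamic firepower allocation problem in cooperative air defense operations for a group of strategic locations on the sea, the characteristics of the air defense process for such locations are comprehensively analyzed, the dynamic firepower allocation problem is formulated on the basis of a Markov decision model, and an optimization model is built with the expected damage to the strategic locations and the interception cost as its indexes. Because solving a Markov decision model is prone to the curse of dimensionality, an approximate dynamic programming method is proposed to explore the validity of the solution, and a reinforcement-learning-based least-squares temporal difference algorithm is given to solve the problem. Simulation results for 80 cases across 4 typical attack-defense scenarios show that, compared with the traditional matching algorithm, genetic algorithm, and particle swarm optimization algorithm, the newly constructed model and algorithm are more scientific, reasonable, and effective, and can provide a theoretical basis for firepower allocation in cooperative air defense operations for a group of strategic locations on the sea.

Key words: strategic location on the sea, dynamic firepower allocation, reinforcement learning, Markov decision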

Abstract:

To address dynamic firepower allocation in cooperative air defense operations for strategic locations on the sea, the characteristics of such air defense operations are comprehensively analyzed, the dynamic firepower allocation problem is formulated on the basis of a Markov decision model, and an optimization model is constructed with the expected damage and the interception cost as its indexes. Because solving the Markov decision model is prone to the curse of dimensionality, an approximate dynamic programming method is proposed to explore the validity of the solution, and a least-squares temporal difference algorithm based on reinforcement learning is given to solve the problem. Simulation results for 80 cases in four typical offensive and defensive scenarios show that, compared with the traditional matching algorithm, the genetic algorithm, and the particle swarm optimization algorithm, the proposed model and algorithm are more scientific, reasonable, and effective, and can provide a theoretical basis for firepower allocation in cooperative air defense operations for strategic locations on the sea.

Key words: strategic location on the sea, dynamic firepower allocation, reinforcement learning, Markov decision

CLC number: