
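1. School of Automation, Hangzhou Dianzi University (杭州电子科技大学), Hangzhou 310018, Zhejiang; 2. Algorithm Center, China Aerospace Science and Technology System and Innovation Research Institute (中国航天科技体系与创新研究院), Beijing 100088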
Received: 13 October 2025
Online First: 13 February 2026
郭焱斌,方峰,王振亚,等. 基于序列到序列深度神经网络的WTA优化求解方法[J/OL]. 兵工学报, 2026(2026-02-16). https://doi.org/10.12382/bgxb.2025.0915.
GUO Y B, FANG F, ZUO Y, et al. A sequence-to-sequence deep neural network-based optimization method for the weapon-target assignment problem[J/OL]. Acta Armamentarii, 2026(2026-02-16). https://doi.org/10.12382/bgxb.2025.0915. (in Chinese)
Large-scale Weapon-Target Assignment (WTA) suffers from a highly complex decision space and low solution efficiency. To address this, this work proposes an improved sequence-to-sequence deep neural network method for WTA optimization that combines pointer networks, an Actor-Critic policy gradient update mechanism, and online fine-tuning of pre-trained models. The method improves solution quality and solution speed, and provides a degree of generalization.

The WTA decision is formulated as a mapping from a sequence of weapon-target information to a sequence of weapon-target engagement pairings. A pointer network is built with Long Short-Term Memory (LSTM) networks as both encoder and decoder, and a cross-attention mechanism is introduced to match targets and weapons dynamically, which reduces the sensitivity of decision performance to changes in input order. Model gradients are updated in an Actor-Critic manner, which speeds up learning and reduces the variance of REINFORCE policy gradient updates. A two-stage optimization mechanism that combines offline pre-training with online fine-tuning improves decision generalization across diverse scenarios.

Simulation results show that, compared with traditional heuristic algorithms, the proposed algorithm balances solution quality and solution speed. Compared with deep reinforcement learning optimization methods represented by DQN, it shows better generalization and robustness.
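As a rough illustration of the pointer-style selection step described in the abstract, the following Python sketch picks one weapon per target with a masked softmax. It is not the authors' model. The vectors stand in for LSTM encoder and decoder states, plain dot-product scoring replaces a learned attention layer, the one-weapon-per-target constraint is an assumption, and the names `masked_pointer_probs` and `greedy_decode` are hypothetical.

```python
import math


def masked_pointer_probs(query, keys, available):
    """Softmax over dot-product scores, restricted to still-available items.

    Assumption: dot-product scoring is used here for brevity in place of a
    learned attention layer.
    """
    scores = [
        sum(q * k for q, k in zip(query, key)) if ok else -math.inf
        for key, ok in zip(keys, available)
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_decode(target_queries, weapon_keys):
    """Point at the highest-probability unused weapon for each target in turn.

    Assumption: each weapon is used at most once and each target gets exactly
    one weapon; the paper's actual constraints may differ.
    """
    if len(target_queries) > len(weapon_keys):
        raise ValueError("need at least as many weapons as targets")
    available = [True] * len(weapon_keys)
    pairs = []
    for t, query in enumerate(target_queries):
        probs = masked_pointer_probs(query, weapon_keys, available)
        w = max(range(len(probs)), key=probs.__getitem__)
        available[w] = False  # mask the chosen weapon for later steps
        pairs.append((t, w))
    return pairs


# Placeholder embeddings (hypothetical values, not data from the paper).
weapon_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target_queries = [[1.0, 2.0], [1.0, 0.5]]
print(greedy_decode(target_queries, weapon_keys))  # [(0, 2), (1, 0)]
```

The mask is what turns a generic attention distribution into a valid assignment sequence: once a weapon is chosen its score becomes negative infinity, so its probability is exactly zero at later steps. During training, a sampled choice would typically replace the greedy `max`.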