
Acta Armamentarii, 2023, Vol. 44, Issue (6): 1547-1563. doi: 10.12382/bgxb.2022.0711


Multi-Dimensional Decision-Making for UAV Air Combat Based on Hierarchical Reinforcement Learning

ZHANG Jiandong1, WANG Dinghan1, YANG Qiming1,*, SHI Guoqing1, LU Yi2, ZHANG Yaozhong1

  1. School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710072, Shaanxi, China
  2. AVIC Shenyang Aircraft Design and Research Institute, Shenyang 110035, Liaoning, China
  • Received: 2022-08-13 Online: 2023-06-30
  • Contact: YANG Qiming

Abstract:

To address intelligent decision-making in UAV air combat, a multi-dimensional decision-making model based on a hierarchical reinforcement learning architecture is established. The model extends autonomous air combat decision-making from single-dimensional maneuver decisions to multi-dimensional decisions that include radar switching, active jamming, formation conversion, target detection, target tracking, interference avoidance, and weapon selection, among others, so that autonomous decision-making is realized in the main stages of air combat. To cope with the increased state-space complexity and low learning efficiency that follow the dimension expansion, a meta-strategy group is trained and established using the Soft Actor-Critic algorithm and expert experience, and the traditional Option-Critic algorithm is improved. The strategy termination function is designed and optimized to make strategy switching more flexible and to enable seamless switching among multi-dimensional decisions in air combat. The experimental results show that the proposed method is effective in confrontation across the multi-dimensional decisions of the whole UAV air combat process: it controls the agent to switch flexibly among interference, search, strike, and avoidance strategies according to the battlefield situation, improving on the performance of traditional algorithms and the efficiency of solving complex decision-making problems.

Key words: UAV air combat, multi-dimensional decision-making, hierarchical reinforcement learning, Soft Actor-Critic algorithm, Option-Critic algorithm
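
As a rough illustration of the strategy switching described in the abstract, the sketch below shows the generic call-and-return execution used in option-based hierarchical reinforcement learning: the agent keeps its current strategy until a termination function fires, then a policy over strategies re-selects. This is not the authors' implementation. The function names (`select_option`, `run_episode`, `toy_q`, `toy_beta`), the state fields (`target_found`, `threat`), and all numeric values are hypothetical, and the hand-written value and termination functions stand in for quantities that would be learned in practice.

```python
import random

# Strategy names follow the four strategies named in the abstract.
OPTIONS = ["interference", "search", "strike", "avoidance"]


def select_option(q_values):
    """Greedy policy over options: pick the option with the highest value."""
    return max(q_values, key=q_values.get)


def run_episode(q_fn, beta_fn, states, rng):
    """Call-and-return execution over a sequence of states.

    q_fn(state)            -> dict mapping option name to estimated value
    beta_fn(state, option) -> termination probability in [0, 1]
    """
    history = []
    option = None
    for state in states:
        # Re-select at the start, or when the current option terminates.
        if option is None or rng.random() < beta_fn(state, option):
            option = select_option(q_fn(state))
        history.append(option)
    return history


def toy_q(state):
    """Hypothetical hand-written option values (assumption, not learned)."""
    return {
        "interference": 0.2,
        "search": 0.0 if state["target_found"] else 1.0,
        "strike": 0.8 if state["target_found"] else 0.0,
        "avoidance": 2.0 if state["threat"] else 0.0,
    }


def toy_beta(state, option):
    """Hypothetical termination rule: stop exactly when the current
    option is no longer the greedy choice in this state."""
    return 0.0 if select_option(toy_q(state)) == option else 1.0


# A short hand-made sequence of battlefield situations (illustrative only).
states = [
    {"target_found": False, "threat": False},
    {"target_found": True, "threat": False},
    {"target_found": True, "threat": True},
    {"target_found": True, "threat": False},
]
history = run_episode(toy_q, toy_beta, states, random.Random(0))
print(history)
```

In this toy setting the termination probabilities are only 0 or 1, so the run is deterministic. A learned termination function would typically output intermediate probabilities, which is where the trade-off between holding a strategy and switching promptly arises.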