
Acta Armamentarii ›› 2023, Vol. 44 ›› Issue (11): 3295-3309. doi: 10.12382/bgxb.2023.0810

Special Issue: Swarm Collaboration and Autonomous Technology


Multi-agent Reinforcement Learning-based Offloading Decision for UAV Cluster Combat Tasks

LI Jiajian1,2, SHI Yanjun1,2,*, YANG Yu1,3, LI Bo3,4, ZHAO Xijun3,4

  1 School of Mechanical Engineering, Dalian University of Technology, Dalian 116024, Liaoning, China
  2 State Key Laboratory of High-performance Precision Manufacturing, Dalian 116024, Liaoning, China
  3 China North Artificial Intelligence & Innovation Research Institute, Beijing 100072, China
  4 Collective Intelligence & Collaboration Laboratory, Beijing 100072, China
  • Received: 2023-08-29 Online: 2023-11-05
  • Contact: SHI Yanjun

Abstract:

Task offloading has become a research hotspot in recent years. It is one of the key technologies for ensuring efficient cooperative operations of unmanned aerial vehicle (UAV) clusters, and it aims to overcome the insufficient computing power and limited energy of a single platform. Offloading computing tasks to edge-network servers for processing reduces costs and increases efficiency. This paper takes UAV cluster-assisted air-ground integrated cooperative reconnaissance as the combat scenario and considers both the complex wartime electromagnetic environment and the time-varying network topology of the cluster. The long-term task offloading problem is decoupled into an online Markov decision process via Lyapunov optimization. To address the difficult convergence in the hybrid action space and the low learning efficiency, a multi-agent reinforcement learning offloading decision algorithm driven by data-model bi-level optimization is proposed. It combines convex optimization with a multi-agent deep deterministic policy method to solve the power allocation and task allocation problems hierarchically. Numerical experiments show that the proposed algorithm adaptively adjusts each agent's task offloading strategy to the time-varying battlefield environment, improves on the performance of traditional algorithms, and optimizes the complex multi-dimensional objectives.
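
The abstract does not give the formulation, so the following is only a minimal sketch of the generic Lyapunov drift-plus-penalty idea that such a decoupling usually relies on, not the authors' algorithm. It assumes a single agent, a long-term average energy budget tracked by one virtual queue, and delay as the per-slot penalty. The names `Option`, `update_virtual_queue`, `choose_option`, `simulate`, and all numeric values are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Option:
    """One candidate way to handle a task in a slot (hypothetical toy values)."""
    name: str
    delay: float   # per-slot penalty to minimise
    energy: float  # resource subject to a long-term average budget


def update_virtual_queue(q: float, energy_used: float, energy_budget: float) -> float:
    """Virtual queue recursion Q(t+1) = max(Q(t) + e(t) - e_avg, 0)."""
    return max(q + energy_used - energy_budget, 0.0)


def choose_option(q: float, options: Sequence[Option], v: float) -> Option:
    """Per-slot drift-plus-penalty rule: minimise V * delay + Q(t) * energy."""
    return min(options, key=lambda o: v * o.delay + q * o.energy)


def simulate(options: Sequence[Option], energy_budget: float,
             v: float, slots: int) -> List[Tuple[str, float]]:
    """Run the online rule and record (chosen option, queue backlog) per slot."""
    q = 0.0
    history = []
    for _ in range(slots):
        chosen = choose_option(q, options, v)
        q = update_virtual_queue(q, chosen.energy, energy_budget)
        history.append((chosen.name, q))
    return history


if __name__ == "__main__":
    # Assumed toy trade-off: offloading is faster but costs more energy.
    toy_options = [
        Option("offload", delay=1.0, energy=3.0),
        Option("local", delay=2.0, energy=1.0),
    ]
    for name, backlog in simulate(toy_options, energy_budget=2.0, v=1.0, slots=4):
        print(name, backlog)
```

In this sketch the weight `v` trades delay against adherence to the energy budget: a larger `v` favours low delay, while a growing backlog `q` pushes the decision toward the lower-energy option. Each slot is decided from the current state alone, which is what makes the long-term problem solvable online.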

Key words: UAV cluster, task offload, multi-agent reinforcement learning, convex optimization, Lyapunov optimization
