
Acta Armamentarii, 2025, Vol. 46, Issue (5): 241146. doi: 10.12382/bgxb.2024.1146


Path Planning Method for Large-scale UAV Swarms Based on Reinforcement Learning Conflict Resolution

ZHOU Zhenlin1,2, LONG Teng1,2,3,4, LIU Dawei5, SUN Jingliang1,2,3,*, ZHONG Jianxin1,2, LI Junzhi1,2

    1 School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
    2 Key Laboratory of Dynamics and Control of Flight Vehicle of Ministry of Education, Beijing 100081, China
    3 Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401121, China
    4 National Key Laboratory of Land and Air Based Information Perception and Control, Beijing 100081, China
    5 Research and Development Academy of Machinery Equipment, Beijing 100089, China
  • Received: 2024-12-24  Online: 2025-05-07
  • Contact: SUN Jingliang

Abstract:

In large-scale unmanned aerial vehicle (UAV) swarm cooperative flight scenarios, frequent path conflicts make swarm path planning computationally expensive. To address this problem, a large-scale UAV swarm path planning method based on reinforcement learning conflict resolution is developed. A dual-layer planning architecture, comprising a high-level conflict resolution layer and a low-level path planning layer, is constructed to reduce the spatial and temporal dimensions of path conflicts. At the high-level conflict resolution layer, a conflict resolution strategy network trained under the Rainbow deep Q-network (DQN) framework is designed. The network casts the resolution of each path conflict as the selection of the left or right child node of a binary conflict tree, mapping different conflict resolution sequences to their outcomes, thereby reducing the number of tree nodes traversed and improving conflict resolution efficiency. At the low-level path planning layer, the time dimension is incorporated into the spatial collision avoidance strategy, and a re-planning jump point search (ReJPS) method based on a node re-expansion mechanism is proposed, which enlarges the feasible planning domain and enhances the ability to resolve path conflicts. Simulation results indicate that, compared with path planning methods based on conflict-based search (CBS)+A* and CBS+ReJPS, the proposed method reduces the average planning time by 86.64% and 19.65%, respectively, while maintaining comparable path optimality.
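To make the dual-layer idea concrete, the following is a minimal, hypothetical Python sketch, not the authors' implementation: a CBS-style high level in which the branching decision at each binary conflict-tree node (constrain the first conflicting agent, i.e. the left child, or the second, i.e. the right child) is delegated to a policy. The Rainbow DQN policy described in the abstract is replaced here by a random stub (choose_branch), and the ReJPS low-level planner is replaced by a simple space-time BFS on a grid; all function and variable names are illustrative assumptions.

```python
from collections import deque
import random

def low_level_plan(start, goal, grid, constraints, horizon=50):
    """Space-time BFS on a 4-connected grid avoiding (cell, time) constraints.
    Stands in for the ReJPS low-level planner of the paper."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0, [start])])
    visited = {(start, 0)}
    while queue:
        cell, t, path = queue.popleft()
        if cell == goal:
            return path
        if t >= horizon:
            continue
        r, c = cell
        for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):  # wait or move
            nxt = (r + dr, c + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0
                    and (nxt, t + 1) not in constraints
                    and (nxt, t + 1) not in visited):
                visited.add((nxt, t + 1))
                queue.append((nxt, t + 1, path + [nxt]))
    return None

def first_conflict(paths):
    """Return (i, j, cell, t) of the first vertex conflict, or None if conflict-free."""
    horizon = max(len(p) for p in paths)
    for t in range(horizon):
        occupied = {}
        for i, p in enumerate(paths):
            cell = p[min(t, len(p) - 1)]  # agents wait at their goals
            if cell in occupied:
                return occupied[cell], i, cell, t
            occupied[cell] = i
    return None

def choose_branch(conflict):
    """Stub for the learned conflict-resolution policy (Rainbow DQN in the paper):
    0 constrains the first conflicting agent (left child), 1 the second (right child)."""
    return random.randint(0, 1)

def resolve(starts, goals, grid, max_iters=200):
    """High-level loop: resolve one conflict per iteration along a single
    policy-selected branch of the binary conflict tree."""
    n = len(starts)
    constraints = [set() for _ in range(n)]
    paths = [low_level_plan(starts[i], goals[i], grid, constraints[i]) for i in range(n)]
    for _ in range(max_iters):
        conflict = first_conflict(paths)
        if conflict is None:
            return paths                      # conflict-free joint plan found
        i, j, cell, t = conflict
        k = (i, j)[choose_branch(conflict)]   # policy picks which child to follow
        constraints[k].add((cell, t))
        paths[k] = low_level_plan(starts[k], goals[k], grid, constraints[k])
    return paths

if __name__ == "__main__":
    grid = [[0] * 5 for _ in range(5)]        # 5x5 obstacle-free map
    print(resolve([(0, 0), (4, 0)], [(4, 4), (0, 4)], grid))
```

In this sketch, following only one policy-selected child per conflict is what avoids the exhaustive expansion of both children that a standard CBS conflict tree would require; a trained policy, rather than the random stub used here, would be needed to keep the resulting plans near-optimal.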

Key words: unmanned aerial vehicle swarm, path planning, deep reinforcement learning, conflict-based search, conflict resolution
