
Acta Armamentarii, 2025, Vol. 46, Issue (5): 241146. doi: 10.12382/bgxb.2024.1146


Path Planning Method for Large-scale UAV Swarms Based on Reinforcement Learning Conflict Resolution

ZHOU Zhenlin1,2, LONG Teng1,2,3,4, LIU Dawei5, SUN Jingliang1,2,3,*, ZHONG Jianxin1,2, LI Junzhi1,2

    1 School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
    2 Key Laboratory of Dynamics and Control of Flight Vehicle of Ministry of Education, Beijing 100081, China
    3 Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401121, China
    4 National Key Laboratory of Land and Air Based Information Perception and Control, Beijing 100081, China
    5 Research and Development Academy of Machinery Equipment, Beijing 100089, China
  • Received: 2024-12-24  Online: 2025-05-07
  • Contact: SUN Jingliang

Abstract:

In large-scale unmanned aerial vehicle (UAV) swarm cooperative flight scenarios, frequent path conflicts make swarm path planning computationally expensive. To address this problem, a large-scale UAV swarm path planning method based on reinforcement learning conflict resolution is developed. A dual-layer planning architecture, comprising a high-level conflict resolution layer and a low-level path planning layer, is constructed to reduce the spatial and temporal dimensions of path conflicts. At the high-level conflict resolution layer, a conflict resolution strategy network trained under the Rainbow deep Q-network (DQN) framework is designed. The network casts the resolution of each path conflict as the selection of the left or right child node of a binary conflict tree, mapping different conflict resolution sequences to their outcomes, thereby reducing the number of tree nodes traversed and improving conflict resolution efficiency. At the low-level path planning layer, the time dimension is incorporated into the spatial collision avoidance strategy, and a re-planning jump point search (ReJPS) method based on a node re-expansion mechanism is proposed, which enlarges the feasible planning domain and enhances the ability to resolve path conflicts. Simulation results indicate that, compared with path planning methods based on conflict-based search (CBS)+A* and CBS+ReJPS, the proposed method reduces the average planning time by 86.64% and 19.65%, respectively, while maintaining comparable path optimality.
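To make the dual-layer idea concrete, the following is a minimal, hypothetical Python sketch, not the authors' implementation: a CBS-style high level in which the branching decision at each binary conflict-tree node (constrain the first conflicting agent, i.e. the left child, or the second, i.e. the right child) is delegated to a policy. The Rainbow DQN policy described in the abstract is replaced here by a random stub (choose_branch), and the ReJPS low-level planner is replaced by a simple space-time BFS on a grid; all function and variable names are illustrative assumptions.

```python
from collections import deque
import random

def low_level_plan(start, goal, grid, constraints, horizon=50):
    """Space-time BFS on a 4-connected grid avoiding (cell, time) constraints.
    Stands in for the ReJPS low-level planner of the paper."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0, [start])])
    visited = {(start, 0)}
    while queue:
        cell, t, path = queue.popleft()
        if cell == goal:
            return path
        if t >= horizon:
            continue
        r, c = cell
        for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):  # wait or move
            nxt = (r + dr, c + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0
                    and (nxt, t + 1) not in constraints
                    and (nxt, t + 1) not in visited):
                visited.add((nxt, t + 1))
                queue.append((nxt, t + 1, path + [nxt]))
    return None

def first_conflict(paths):
    """Return (i, j, cell, t) of the first vertex conflict, or None if conflict-free."""
    horizon = max(len(p) for p in paths)
    for t in range(horizon):
        occupied = {}
        for i, p in enumerate(paths):
            cell = p[min(t, len(p) - 1)]  # agents wait at their goals
            if cell in occupied:
                return occupied[cell], i, cell, t
            occupied[cell] = i
    return None

def choose_branch(conflict):
    """Stub for the learned conflict-resolution policy (Rainbow DQN in the paper):
    0 constrains the first conflicting agent (left child), 1 the second (right child)."""
    return random.randint(0, 1)

def resolve(starts, goals, grid, max_iters=200):
    """High-level loop: resolve one conflict per iteration along a single
    policy-selected branch of the binary conflict tree."""
    n = len(starts)
    constraints = [set() for _ in range(n)]
    paths = [low_level_plan(starts[i], goals[i], grid, constraints[i]) for i in range(n)]
    for _ in range(max_iters):
        conflict = first_conflict(paths)
        if conflict is None:
            return paths                      # conflict-free joint plan found
        i, j, cell, t = conflict
        k = (i, j)[choose_branch(conflict)]   # policy picks which child to follow
        constraints[k].add((cell, t))
        paths[k] = low_level_plan(starts[k], goals[k], grid, constraints[k])
    return paths

if __name__ == "__main__":
    grid = [[0] * 5 for _ in range(5)]        # 5x5 obstacle-free map
    print(resolve([(0, 0), (4, 0)], [(4, 4), (0, 4)], grid))
```

In this sketch, following only one policy-selected child per conflict is what avoids the exhaustive expansion of both children that a standard CBS conflict tree would require; a trained policy, rather than the random stub used here, would be needed to keep the resulting plans near-optimal.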

Key words: unmanned aerial vehicle swarm, path planning, deep reinforcement learning, conflict-based search, conflict resolution
