Acta Armamentarii ›› 2025, Vol. 46 ›› Issue (2): 240265. doi: 10.12382/bgxb.2024.0265
LI Zonggang1,2,*, HAN Sen1,2, CHEN Yinjuan1,2, NING Xiaogang1,2
Received: 2024-04-09
Online: 2025-02-28
Contact: LI Zonggang
LI Zonggang, HAN Sen, CHEN Yinjuan, NING Xiaogang. A Path Planning Algorithm for Mobile Robots Based on Angle Searching and Deep Q-Network[J]. Acta Armamentarii, 2025, 46(2): 240265.
Algorithm 1 AS-DQN
Initialization: initialize the replay memory, the Q network, the target network, and the remaining hyperparameters; set S = 0.
1: for S < Smax do
2:   if S ≠ 0 then sk = sk+1
3:   else get the initial observation sk
4:   end if
5:   if S < pre.step then
6:     randomly select action ak
7:   else
8:     if μ ≥ ε then randomly select action ak
9:     else select ak = argmaxa Q(sk, a; θ)
10:    end if
11:  end if
12:  if coordinate.ak = FALSE then break
13:  end if
14:  Store experience ek = (sk, ak, rk, sk+1)
15:  if S < decline.step then
16:    ε = ε + 0.002
17:  end if
18:  if S > pre.step then
19:    Calculate the loss (y − Q(si, ai; θi))², where y = ri + γ maxa′ Q(si+1, a′; θ⁻)
20:    Train and update the Q network weights θi
21:    Every Z steps, copy θi to the target network weights θ⁻
22:  end if
23: end for

Here μ is drawn uniformly from [0,1] at each step, and ε is the probability of selecting the max-Q action (see Table 2), so the agent explores when μ ≥ ε and exploits otherwise.
Table 1 AS-DQN pseudocode
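To make Algorithm 1 concrete, the sketch below implements the same loop in PyTorch. It is a minimal illustration under stated assumptions, not the paper's code: the Gym-style `env` (assumed to return `(next_state, reward, done)` from `step`) and the `angle_search_valid` stand-in for the `coordinate.ak` check are hypothetical; hyperparameter values follow Table 2.

```python
# Minimal sketch of the AS-DQN training loop in Algorithm 1 (assumptions noted above).
import random
from collections import deque

import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small fully connected Q network; the paper's architecture may differ."""
    def __init__(self, n_states, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_states, 64), nn.ReLU(), nn.Linear(64, n_actions))

    def forward(self, x):
        return self.net(x)

def train(env, n_states, n_actions, angle_search_valid,
          s_max=30000, pre_step=100, decline_step=500,
          batch=32, gamma=0.9, lr=0.001, z=100):
    q_net = QNet(n_states, n_actions)
    target_net = QNet(n_states, n_actions)
    target_net.load_state_dict(q_net.state_dict())
    opt = torch.optim.Adam(q_net.parameters(), lr=lr)
    memory = deque(maxlen=20000)          # replay memory pool (Table 2)
    eps = 0.01                            # probability of the max-Q action
    s, state = 0, env.reset()
    while s < s_max:
        # Lines 5-11: random actions during pre-training, then epsilon-greedy
        # where eps is the probability of exploiting (it grows toward 1).
        if s < pre_step or random.random() >= eps:
            action = random.randrange(n_actions)
        else:
            with torch.no_grad():
                action = q_net(torch.tensor(state).float()).argmax().item()
        if not angle_search_valid(state, action):  # line 12: coordinate.ak = FALSE
            break
        next_state, reward, done = env.step(action)
        memory.append((state, action, reward, next_state))  # line 14
        if s < decline_step:
            eps = min(eps + 0.002, 1.0)   # lines 15-17: raise eps, capped at 1
        if s > pre_step and len(memory) >= batch:
            # Lines 18-20: sample a minibatch, minimize (y - Q(s, a; theta))^2
            ss, aa, rr, ns = zip(*random.sample(memory, batch))
            ss, ns = torch.tensor(ss).float(), torch.tensor(ns).float()
            rr, aa = torch.tensor(rr).float(), torch.tensor(aa).long()
            q = q_net(ss).gather(1, aa.unsqueeze(1)).squeeze(1)
            with torch.no_grad():
                y = rr + gamma * target_net(ns).max(1).values
            loss = nn.functional.mse_loss(q, y)
            opt.zero_grad(); loss.backward(); opt.step()
            if s % z == 0:                # line 21: sync target network every Z steps
                target_net.load_state_dict(q_net.state_dict())
        state = env.reset() if done else next_state
        s += 1
```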
| Parameter | Value |
|---|---|
| Replay memory pool size | 20000 |
| Number of experiences before training begins | 100 |
| Batch size | 32 |
| Target network update frequency | 100 |
| Discount factor γ | 0.9 |
| Learning rate | 0.001 |
| Experience replay memory value | 500 |
| Initial probability ε of selecting the max-Q action | 0.01 |
| Maximum value of ε | 1 |
| Growth rate of ε | 0.002 |
Table 2 Hyperparameters and values of the neural network
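For reference, the Table 2 settings can be grouped into a single configuration object; a minimal sketch in which the field names are illustrative, not from the paper:

```python
from dataclasses import dataclass

@dataclass
class ASDQNConfig:
    """Hyperparameters from Table 2; field names are this sketch's own."""
    replay_memory_size: int = 20000   # replay memory pool
    warmup_experiences: int = 100     # experiences before training begins
    batch_size: int = 32              # samples processed per update
    target_update_freq: int = 100     # target network update frequency (Z)
    gamma: float = 0.9                # discount factor
    learning_rate: float = 0.001
    replay_memory_value: int = 500    # experience replay memory value
    eps_start: float = 0.01           # initial prob. of the max-Q action
    eps_max: float = 1.0              # upper bound on eps
    eps_growth: float = 0.002         # per-step increase in eps
```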
| Robot | Convergence steps | Convergence time/s |
|---|---|---|
| R1 | 28000 | 372.9 |
| R2 | 30000 | 446.5 |
| R3 | 17000 | 258.6 |
Table 3 Mobile robot data for the 8×8 map
| Robot | Convergence steps | Convergence time/s |
|---|---|---|
| R4 | 109000 | 1510.6 |
| R5 | 103000 | 1429.8 |
| R6 | 75000 | 1137.2 |
Table 4 Mobile robot data for the 12×12 map
| Algorithm | Convergence steps | Convergence time/s | Time saved vs DQN/% |
|---|---|---|---|
| DQN | 27000 | 368.5 | |
| AS-DQN | 20500 | 292.3 | 20.68 |
Table 5 Model data for static obstacles on the 8×8 map
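The time-saved column in Tables 5-8 is the relative reduction in convergence time of AS-DQN over DQN; for Table 5:

$$\text{time saved} = \frac{t_{\mathrm{DQN}} - t_{\mathrm{AS\text{-}DQN}}}{t_{\mathrm{DQN}}} \times 100\% = \frac{368.5 - 292.3}{368.5} \times 100\% \approx 20.68\%$$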
| Algorithm | Convergence steps | Convergence time/s | Time saved vs DQN/% |
|---|---|---|---|
| DQN | 28000 | 398.1 | |
| AS-DQN | 20000 | 302.6 | 23.99 |
Table 6 Model data for dynamic obstacles on the 8×8 map
| Algorithm | Convergence steps | Convergence time/s | Time saved vs DQN/% |
|---|---|---|---|
| DQN | 90000 | 1216.9 | |
| AS-DQN | 75000 | 1025.3 | 15.75 |
Table 7 Model data for static obstacles on the 12×12 map
| Algorithm | Convergence steps | Convergence time/s | Time saved vs DQN/% |
|---|---|---|---|
| DQN | 109000 | 1510.6 | |
| AS-DQN | 85500 | 1260.7 | 16.54 |
Table 8 Model data for dynamic obstacles on the 12×12 map
| Algorithm | Convergence steps | Convergence time/s | Time saved vs DQN/% | Time saved vs AS-DQN/% |
|---|---|---|---|---|
| DQN | 27000 | 368.5 | | |
| AS-DQN | 20500 | 292.3 | | |
| AS-DQN(IIFT) | 12000 | 169.8 | 53.92 | 41.91 |
Table 9 Model data for static obstacles on the 8×8 map
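The two percentages reported for AS-DQN(IIFT) in Tables 9-12 measure its savings against each baseline separately; for Table 9:

$$\frac{368.5 - 169.8}{368.5} \times 100\% \approx 53.92\% \;(\text{vs DQN}), \qquad \frac{292.3 - 169.8}{292.3} \times 100\% \approx 41.91\% \;(\text{vs AS-DQN})$$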
| Algorithm | Convergence steps | Convergence time/s | Time saved vs DQN/% | Time saved vs AS-DQN/% |
|---|---|---|---|---|
| DQN | 28000 | 398.1 | | |
| AS-DQN | 20000 | 302.6 | | |
| AS-DQN(IIFT) | 12500 | 188.5 | 52.65 | 37.71 |
Table 10 Model data for dynamic obstacles on the 8×8 map
| Algorithm | Convergence steps | Convergence time/s | Time saved vs DQN/% | Time saved vs AS-DQN/% |
|---|---|---|---|---|
| DQN | 90000 | 1216.9 | | |
| AS-DQN | 75000 | 1025.3 | | |
| AS-DQN(IIFT) | 56000 | 726.1 | 40.33 | 29.18 |
Table 11 Model data for static obstacles on the 12×12 map
| Algorithm | Convergence steps | Convergence time/s | Time saved vs DQN/% | Time saved vs AS-DQN/% |
|---|---|---|---|---|
| DQN | 109000 | 1510.6 | | |
| AS-DQN | 85500 | 1260.7 | | |
| AS-DQN(IIFT) | 65000 | 973.8 | 35.54 | 22.76 |
Table 12 Model data for dynamic obstacles on the 12×12 map