Acta Armamentarii, 2024, Vol. 45, Issue (12): 4372-4382. doi: 10.12382/bgxb.2023.0982
DONG Mingze, WEN Zhuanglei, CHEN Xiai*, YANG Jiongkun, ZENG Tao
Received: 2023-09-27
Online: 2024-02-27
Contact: CHEN Xiai
DONG Mingze, WEN Zhuanglei, CHEN Xiai, YANG Jiongkun, ZENG Tao. Research on Robot Navigation Method Integrating Safe Convex Space and Deep Reinforcement Learning[J]. Acta Armamentarii, 2024, 45(12): 4372-4382.
Stage | Environment size/m | Number of static obstacles | Number of dynamic obstacles | Dynamic obstacle radius/m | Dynamic obstacle speed/(m·s⁻¹) |
---|---|---|---|---|---|
1 | 20×30 | 0 | 0 | | |
2 | 20×30 | 10 | 0 | | |
3 | 20×30 | 10 | 5 | 0.2~0.3 | 0.3 |
4 | 20×30 | 10 | 10 | 0.2~0.3 | 0.3 |
5 | 10×10 | 0 | 10 | 0.1~0.4 | 0.3~0.6 |
6 | 10×10 | 0 | 20 | 0.1~0.4 | 0.3~0.6 |
7 | 10×10 | 0 | 30 | 0.1~0.4 | 0.3~0.6 |
Table 1 Staged training environment parameter settings
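The staged settings in Table 1 can be written down as a small curriculum configuration. The following is a minimal sketch, not the authors' implementation: the `Stage` class, its field names, `CURRICULUM`, and `sample_dynamic_obstacles` are all hypothetical, and drawing radius and speed uniformly from the listed ranges is an assumption (the table gives only the ranges).

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    """One row of Table 1. Field names are illustrative, not from the paper."""
    index: int
    env_size_m: Tuple[float, float]            # environment size in metres
    n_static: int                              # number of static obstacles
    n_dynamic: int                             # number of dynamic obstacles
    radius_range_m: Optional[Tuple[float, float]] = None   # empty in stages 1-2
    speed_range_mps: Optional[Tuple[float, float]] = None  # empty in stages 1-2


# Values copied from Table 1; a single speed of 0.3 is stored as (0.3, 0.3).
CURRICULUM: List[Stage] = [
    Stage(1, (20, 30), 0, 0),
    Stage(2, (20, 30), 10, 0),
    Stage(3, (20, 30), 10, 5, (0.2, 0.3), (0.3, 0.3)),
    Stage(4, (20, 30), 10, 10, (0.2, 0.3), (0.3, 0.3)),
    Stage(5, (10, 10), 0, 10, (0.1, 0.4), (0.3, 0.6)),
    Stage(6, (10, 10), 0, 20, (0.1, 0.4), (0.3, 0.6)),
    Stage(7, (10, 10), 0, 30, (0.1, 0.4), (0.3, 0.6)),
]


def sample_dynamic_obstacles(stage: Stage, rng: random.Random) -> List[Tuple[float, float]]:
    """Return one (radius, speed) pair per dynamic obstacle.

    Uniform sampling within the tabulated ranges is an assumption.
    """
    if stage.n_dynamic == 0:
        return []
    return [
        (rng.uniform(*stage.radius_range_m), rng.uniform(*stage.speed_range_mps))
        for _ in range(stage.n_dynamic)
    ]
```

For example, `sample_dynamic_obstacles(CURRICULUM[4], random.Random(0))` yields ten (radius, speed) pairs for Stage 5, while Stages 1 and 2 yield an empty list.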
Method | Success rate/% | Navigation time/s (mean) | Navigation time/s (SD) | Path length/m (mean) | Path length/m (SD) | Speed/(m·s⁻¹) (mean) | Speed/(m·s⁻¹) (SD) | Acceleration/(m·s⁻²) (mean) | Acceleration/(m·s⁻²) (SD) | Jerk/(m·s⁻³) (mean) | Jerk/(m·s⁻³) (SD) |
---|---|---|---|---|---|---|---|---|---|---|---|
Design 1 | 76.0 | 4.0 | 2.2 | 11.3 | 6.3 | 2.9 | 0.3 | 0 | 1.0 | -0.6 | 11.8 |
Design 2 | 76.0 | 4.0 | 2.2 | 11.3 | 6.3 | 2.8 | 0.2 | 0 | 1.5 | -0.1 | 22.5 |
Design 3 | 90.3 | 9.0 | 5.2 | 15.2 | 8.3 | 2.8 | 0.2 | 0.3 | 1.9 | -0.1 | 7.0 |
Design 4 | 83.0 | 5.0 | 2.6 | 11.7 | 6.3 | 2.2 | 0.4 | 0.5 | 1.3 | -0.2 | 4.8 |
Proposed method | 89.2 | 5.0 | 2.6 | 12.2 | 6.6 | 2.2 | 0.4 | 0.3 | 1.4 | -0.5 | 4.0 |
Table 2 Navigation performance statistics in the Stage 2 scenario
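The tables report the mean and standard deviation of speed, acceleration, and jerk. How these were computed is not stated on this page. The sketch below shows one plausible way to obtain such statistics from a uniformly sampled speed trace, assuming forward finite differences and the population standard deviation. The function names `finite_diff` and `motion_stats` and the sampling interval `dt` are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Dict, List, Tuple


def finite_diff(values: List[float], dt: float) -> List[float]:
    """Forward difference of a uniformly sampled signal (assumed scheme)."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def motion_stats(speeds: List[float], dt: float) -> Dict[str, Tuple[float, float]]:
    """Return (mean, standard deviation) for speed, acceleration and jerk.

    Acceleration is the first difference of speed and jerk the second.
    Needs at least three samples so that the jerk series is non-empty.
    """
    accel = finite_diff(speeds, dt)
    jerk = finite_diff(accel, dt)
    series = {"speed": speeds, "acceleration": accel, "jerk": jerk}
    return {name: (fmean(x), pstdev(x)) for name, x in series.items()}
```

With a speed trace that rises linearly, for instance `motion_stats([0.0, 1.0, 2.0, 3.0], dt=0.5)`, the acceleration is constant and the jerk is zero, so both have zero standard deviation. A large jerk standard deviation, as in the Design 1 and Design 2 rows, indicates abrupt changes in acceleration.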
Method | Success rate/% | Navigation time/s (mean) | Navigation time/s (SD) | Path length/m (mean) | Path length/m (SD) | Speed/(m·s⁻¹) (mean) | Speed/(m·s⁻¹) (SD) | Acceleration/(m·s⁻²) (mean) | Acceleration/(m·s⁻²) (SD) | Jerk/(m·s⁻³) (mean) | Jerk/(m·s⁻³) (SD) |
---|---|---|---|---|---|---|---|---|---|---|---|
Design 1 | 80 | 4.0 | 2.2 | 11.4 | 6.2 | 2.9 | 0.3 | -0.1 | 1.3 | -0.1 | 16.9 |
Design 2 | 79 | 4.0 | 2.1 | 11.4 | 6.2 | 2.9 | 0.2 | -0.1 | 2.2 | 0.0 | 35.9 |
Design 3 | 88 | 8.8 | 5.0 | 15.4 | 8.6 | 1.8 | 0.4 | 0.3 | 1.8 | -0.1 | 6.6 |
Design 4 | 84 | 4.9 | 2.4 | 11.6 | 6.3 | 2.2 | 0.4 | 0.5 | 1.3 | -0.3 | 4.9 |
Proposed method | 89 | 5.9 | 3.3 | 13.0 | 7.3 | 2.2 | 0.4 | 0.3 | 1.4 | -0.5 | 4.2 |
Table 3 Navigation performance statistics in the Stage 3 scenario
Method | Success rate/% | Navigation time/s (mean) | Navigation time/s (SD) | Path length/m (mean) | Path length/m (SD) | Speed/(m·s⁻¹) (mean) | Speed/(m·s⁻¹) (SD) | Acceleration/(m·s⁻²) (mean) | Acceleration/(m·s⁻²) (SD) | Jerk/(m·s⁻³) (mean) | Jerk/(m·s⁻³) (SD) |
---|---|---|---|---|---|---|---|---|---|---|---|
Design 1 | 81 | 3.9 | 2.2 | 11.2 | 6.5 | 2.9 | 0.3 | -0.1 | 1.7 | -0.4 | 24.7 |
Design 2 | 75 | 3.8 | 2.2 | 10.8 | 6.3 | 2.9 | 0.3 | -0.1 | 2.1 | 0.4 | 32.9 |
Design 3 | 85 | 9.0 | 5.4 | 15.1 | 8.7 | 1.7 | 0.4 | 0.3 | 1.8 | -0.1 | 6.5 |
Design 4 | 84 | 4.8 | 2.5 | 11.5 | 6.5 | 2.2 | 0.4 | 0.6 | 1.3 | -0.3 | 4.8 |
Proposed method | 86 | 5.8 | 3.3 | 12.5 | 7.1 | 2.1 | 0.4 | 0.3 | 1.4 | -0.5 | 4.3 |
Table 4 Navigation performance statistics in the Stage 4 scenario
Reward function | Stage | Success rate/% | Navigation time/s (mean) | Navigation time/s (SD) | Path length/m (mean) | Path length/m (SD) | Speed/(m·s⁻¹) (mean) | Speed/(m·s⁻¹) (SD) | Acceleration/(m·s⁻²) (mean) | Acceleration/(m·s⁻²) (SD) | Jerk/(m·s⁻³) (mean) | Jerk/(m·s⁻³) (SD) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
rt1 | 5 | 87 | 2.8 | 1.6 | 4.6 | 2.5 | 1.6 | 0.4 | 0.6 | 1.4 | -0.9 | 4.1 |
rt1 | 6 | 77 | 2.8 | 1.8 | 4.1 | 2.4 | 1.4 | 0.4 | 0.6 | 1.4 | 1.4 | 4.1 |
rt1 | 7 | 68 | 3.1 | 2.1 | 4.2 | 2.5 | 1.4 | 0.4 | 0.5 | 1.3 | -0.7 | 4.3 |
rt2 | 5 | 87 | 2.9 | 1.7 | 4.6 | 2.5 | 1.6 | 0.4 | 0.6 | 1.4 | -0.9 | 4.0 |
rt2 | 6 | 75 | 3.0 | 1.8 | 4.3 | 2.4 | 1.5 | 0.4 | 0.6 | 1.4 | -0.8 | 4.2 |
rt2 | 7 | 65 | 2.9 | 2.0 | 3.9 | 2.2 | 1.4 | 0.4 | 0.6 | 1.3 | -0.8 | 4.2 |
Table 5 Navigation performance statistics for reward functions rt1 and rt2 in the Stage 5 to 7 scenarios