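School of Geospatial Information, Information Engineering University, Zhengzhou 450001, Henan, China
Military Facilities Construction Bureau, Logistic Support Department of the Central Military Commission, Beijing 100039, China
Corresponding author email: jueyun1020@126.com
Received: 2025-05-20
Published online: 2026-01-27
Print publication: 2026-03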
CHEN Siye, LI Feng, LIU Qingyun, et al. Robot Path Planning Based on APF-IPPO Algorithm[J]. Acta Armamentarii, 2026, 47(3): 250389. DOI: 10.12382/bgxb.2025.0389.
The traditional proximal policy optimization (PPO) algorithm suffers from sparse rewards and low sample efficiency in robot path planning. To address these problems, this paper proposes APF-IPPO, a path planning algorithm that improves PPO (improved proximal policy optimization, IPPO) with the artificial potential field (APF) method, generalized advantage estimation (GAE), and a hybrid strategy. GAE is introduced into the traditional PPO algorithm to accelerate the convergence of the average reward and to improve the accuracy of advantage function estimation. For action selection, a hybrid strategy based on direction similarity is proposed to dynamically adjust the action probability distribution of the policy network. For reward design, a composite reward function fused with APF is designed to alleviate the sparse reward problem. To verify its performance, APF-IPPO is compared with the DQN, PPO, A*, and APF algorithms in simulation experiments across multiple scenarios. The experimental results show that, in complex static environments, APF-IPPO balances multiple performance metrics and plans the optimal path, while also exhibiting better generalization capability and adaptability to dynamic environments, which verifies its effectiveness and superiority in path planning tasks.
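The abstract names GAE as the advantage estimator but does not give the authors' formulas, so the following is only a minimal sketch of the standard GAE recursion, A_t = delta_t + gamma * lambda * A_{t+1} with delta_t = r_t + gamma * V(s_{t+1}) - V(s_t). The function name `compute_gae`, its parameters, the default values of `gamma` and `lam`, and the toy rollout are illustrative assumptions and not the paper's implementation.

```python
from typing import List, Sequence


def compute_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    last_value: float,
    dones: Sequence[bool],
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """Return GAE advantages for one rollout (hypothetical helper).

    values[t] is the critic estimate V(s_t); last_value is the estimate for
    the state after the final step (ignored if the episode ended there).
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        next_value = last_value if t == len(rewards) - 1 else values[t + 1]
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Discounted sum of TD errors, reset at episode boundaries
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
    return advantages


# Toy rollout (assumed data): a sparse reward that arrives only on the last step.
rewards = [0.0, 0.0, 1.0]
values = [0.0, 0.0, 0.0]
dones = [False, False, True]
advantages = compute_gae(rewards, values, last_value=0.0, dones=dones, gamma=0.9, lam=0.5)
print(advantages)
```

In this toy rollout the single terminal reward is propagated backward to earlier steps with weight (gamma * lam) per step, which illustrates how GAE spreads credit along a trajectory. Setting `lam` to 0 reduces the estimate to the one-step TD error, while values closer to 1 trade lower bias for higher variance.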