Research Institute of Chemical Defense, Academy of Military Sciences, Beijing 102201, China
* E-mail: zuoqinwen@163.com
Received: 2025-06-12
Published online: 2026-02-03
Print publication: 2025
Xuguang TIAN, Qinwen ZUO, Ding WANG, et al. A Path Planning Algorithm based on Deep Reinforcement Learning with Integrated A* Heuristic Search[J]. Acta Armamentarii, 2025, 46(S2): 250493. DOI: 10.12382/bgxb.2025.0493.
To address the challenge of efficient path planning for agents in complex environments, this paper proposes a deep reinforcement learning path planning algorithm that integrates A* heuristic search. An improved deep Q-network (DQN) model is constructed by combining a target network, an advantage network, and a noisy network. A prioritized experience buffer mechanism that incorporates A* prior knowledge is proposed, enabling the agent to learn from historical experience more efficiently and reducing the blindness of early exploration. In addition, an action selection strategy that incorporates the A* heuristic is designed: A* heuristic search is introduced during action selection to avoid local optima, thereby improving the efficiency and stability of path planning. The key elements of the algorithm are designed and optimized, and the algorithm is experimentally verified in multiple simulation environments and compared with traditional path planning algorithms. The results show that the proposed algorithm substantially improves path planning efficiency and stability over the conventional DQN algorithm.
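The abstract does not give the mixing rule, the priority values, or the network details, so the following is only a minimal sketch of how A* guidance might enter action selection and the experience buffer on a grid world. All names here (`ACTIONS`, `astar_action`, `select_action`, `seed_buffer`, `guide_prob`, `demo_priority`) are hypothetical, the probabilistic mixing rule is an assumption, and the `q_values` list is a placeholder for whatever the DQN would output.

```python
import heapq
import random

# Hypothetical action encoding for a 4-connected grid: index -> (d_row, d_col).
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def astar(grid, start, goal):
    """Return a shortest path from start to goal as a list of cells, or None.

    grid[r][c] == 1 marks an obstacle; the heuristic is Manhattan distance.
    """
    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    g_cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:  # walk parents back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > g_cost[cell]:
            continue  # stale heap entry, a cheaper route was already found
        for dr, dc in ACTIONS:
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])):
                continue
            if grid[nxt[0]][nxt[1]] == 1:
                continue
            if g + 1 < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = g + 1
                came_from[nxt] = cell
                heapq.heappush(open_heap, (g + 1 + h(nxt), g + 1, nxt))
    return None


def astar_action(grid, state, goal):
    """Index of the first move along the A* path, or None if no move exists."""
    path = astar(grid, state, goal)
    if path is None or len(path) < 2:
        return None
    step = (path[1][0] - state[0], path[1][1] - state[1])
    return ACTIONS.index(step)


def select_action(q_values, grid, state, goal, guide_prob, rng=random):
    """Assumed rule: follow A* with probability guide_prob, else act greedily on Q."""
    if rng.random() < guide_prob:
        guided = astar_action(grid, state, goal)
        if guided is not None:
            return guided
    return max(range(len(q_values)), key=lambda a: q_values[a])


def seed_buffer(grid, start, goal, demo_priority=2.0):
    """Turn the A* path into (priority, state, action, next_state) entries.

    demo_priority is an assumed constant; rewards and TD errors are omitted.
    """
    path = astar(grid, start, goal) or []
    buffer = []
    for s, s_next in zip(path, path[1:]):
        a = ACTIONS.index((s_next[0] - s[0], s_next[1] - s[1]))
        buffer.append((demo_priority, s, a, s_next))
    return buffer
```

In a training loop one might start with a high `guide_prob` and decay it as the Q-estimates improve, so that A* mainly shapes early exploration. That schedule is likewise an assumption rather than something the abstract specifies.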