
Acta Armamentarii


Safe-DRL: A Safety-Conscious Deep Reinforcement Learning Decision-Making Algorithm for Unmanned Platforms

YANG Fan1,2, LI Xueyuan1*, DU Minggang2, JIANG Yutong2, LIU Qi1

  1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China; 2. Chinese Scholartree Ridge SKL, China Northern Vehicle Research Institute, Beijing 100072, China
  • Received: 2025-01-08 Revised: 2025-07-16
  • Corresponding author: *Email: lixueyuan@bit.edu.cn
  • Funding:
    National Natural Science Foundation of China (524B2162)

Abstract: To address the safety problems caused by the unpredictable behavior of conventional deep reinforcement learning (DRL) during inference, this paper proposes a safety-enhanced DRL algorithm for autonomous driving of unmanned platforms in multi-task scenarios. Built on an improved Markov process, the algorithm introduces an action evaluation network that assesses each action for safety before it is executed, adopts a parallel dual-thread network architecture to suppress hazardous driving behaviors, and uses a new reward function, designed around kinematic characteristics, to balance driving safety and efficiency. Comparative experiments were conducted in the highway-env environment on three typical driving scenarios: single-lane roads, intersections, and roundabouts. The results show that the proposed algorithm significantly improves driving safety and generalization capability, supporting unmanned platform tasks such as remote deployment, cargo transportation, and regional penetration.

Key words: unmanned platform, autonomous driving, Markov process, deep reinforcement learning
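
The pre-execution safety assessment mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' action evaluation network: the learned network is replaced here by a hand-written kinematic check, and the action names (borrowed loosely from the discrete speed actions in highway-env), acceleration values, one-second horizon, and minimum gap are all assumptions made for illustration only.

```python
from typing import Sequence

# Hypothetical discrete actions and assumed accelerations in m/s^2.
ACTIONS = ("SLOWER", "IDLE", "FASTER")
ACCEL = {"SLOWER": -2.0, "IDLE": 0.0, "FASTER": 2.0}


def predicted_gap(gap: float, ego_speed: float, lead_speed: float,
                  accel: float, horizon: float = 1.0) -> float:
    """Gap to the lead vehicle after `horizon` seconds.

    Assumes the ego vehicle holds a constant acceleration and the
    lead vehicle holds a constant speed.
    """
    ego_travel = ego_speed * horizon + 0.5 * accel * horizon ** 2
    lead_travel = lead_speed * horizon
    return gap + lead_travel - ego_travel


def is_safe(gap: float, ego_speed: float, lead_speed: float,
            action: str, min_gap: float = 5.0) -> bool:
    """Stand-in for a learned safety judgment: keep at least `min_gap` metres."""
    return predicted_gap(gap, ego_speed, lead_speed, ACCEL[action]) >= min_gap


def select_action(q_values: Sequence[float], gap: float,
                  ego_speed: float, lead_speed: float) -> str:
    """Return the highest-valued action that passes the safety check.

    `q_values` holds one policy score per entry of ACTIONS. If every
    action is rejected, fall back to braking.
    """
    ranked = sorted(zip(q_values, ACTIONS), reverse=True)
    for _, action in ranked:
        if is_safe(gap, ego_speed, lead_speed, action):
            return action
    return "SLOWER"


if __name__ == "__main__":
    # The policy prefers FASTER, but the check rejects it at a 10 m gap.
    print(select_action([0.1, 0.5, 0.9], gap=10.0, ego_speed=20.0, lead_speed=15.0))
```

The design point is the ordering: the policy's preferred action is screened before it reaches the environment, so an unsafe choice is replaced rather than penalized after the fact. In a learned variant, `is_safe` would be a trained classifier over state-action pairs rather than a fixed rule.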
