
Acta Armamentarii (兵工学报), 2023, Vol. 44, Issue 11: 3465-3477. doi: 10.12382/bgxb.2022.0815

Special topic: Swarm Collaboration and Autonomous Technology


A Human-machine Collaborative Control Algorithm for Intelligent Vehicles Based on Model Prediction and Policy Learning

JIANG Yan, DING Yuyan, ZHANG Xinglong, XU Xin*

  1. College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China
  • Received: 2022-09-07 Online: 2023-05-12
  • Funding:
    National Natural Science Foundation of China (61825305); National Natural Science Foundation of China (62003361); National Natural Science Foundation of China (U21A20518)

Abstract:

To address the difficulty of high-maneuver motion control for intelligent vehicles in complex environments, a human-machine collaborative control algorithm based on model prediction and policy learning is proposed. The algorithm uses the human driver's understanding of the environment and comprehensive processing ability to assist the machine in local trajectory planning, including speed adjustment and dynamic path generation, thereby achieving human-machine collaboration at the decision and planning level. Online optimal planning and control of a vehicle driving at high maneuverability is time-critical, and this is handled in two ways. At the local planning layer, a local trajectory planning method based on model predictive control is designed with a relatively long sampling interval and a simplified dynamics model, so that trajectories can be optimized online efficiently. At the control layer, a learning-based predictive control method built on receding-horizon reinforcement learning optimizes the control policy online, which improves the computational efficiency and adaptability of online optimal control. Driver-in-the-loop simulations of high-maneuver driving on a mountain road show that the proposed method follows the driver's acceleration, deceleration, and steering commands to generate safe and smooth planned trajectories, and that it accurately controls the vehicle along the desired trajectory in real time. In the human-machine collaborative control mode, six ordinary drivers completed the same driving task in 8.3% less time on average than with manual driving, and the steering operation load was reduced by 51.1%.

Key words: intelligent vehicles, human-machine collaboration, high maneuvering motion, reinforcement learning, model predictive control
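
The abstract describes a planning layer that optimizes over a prediction horizon with a long sampling interval and a simplified model, while taking speed commands from the driver. The sketch below illustrates only that general receding-horizon idea and is not the authors' formulation. It assumes a one-dimensional point-mass speed model, a quadratic cost, and acceleration limits enforced by simple clipping rather than a constrained solver. The function names `plan_speed` and `simulate` and all parameter values (horizon, `dt`, weight `r`, `a_max`) are hypothetical.

```python
import numpy as np


def plan_speed(v0, v_ref, horizon=10, dt=0.5, r=1.0):
    """Plan accelerations over the horizon for v[k+1] = v[k] + dt * a[k].

    Minimizes sum_k (v[k] - v_ref)^2 + r * a[k]^2 in closed form.
    This is an illustrative model, not the one used in the paper.
    """
    # Predicted speeds are v = v0 + A @ a, with A = dt * lower-triangular ones.
    A = dt * np.tril(np.ones((horizon, horizon)))
    # Regularized least squares: (A^T A + r I) a = A^T (v_ref - v0).
    rhs = A.T @ np.full(horizon, v_ref - v0)
    return np.linalg.solve(A.T @ A + r * np.eye(horizon), rhs)


def simulate(v0, v_ref, steps=40, dt=0.5, a_max=3.0):
    """Closed loop: replan at every step and apply only the first move."""
    v = v0
    speeds, accels = [v0], []
    for _ in range(steps):
        a = plan_speed(v, v_ref, dt=dt)[0]       # first planned acceleration
        a = float(np.clip(a, -a_max, a_max))     # assumed actuator limit
        v = v + dt * a                           # simplified vehicle model
        speeds.append(v)
        accels.append(a)
    return speeds, accels


if __name__ == "__main__":
    # v_ref stands in for a driver speed command (hypothetical values).
    speeds, accels = simulate(v0=10.0, v_ref=20.0)
    print(f"final speed: {speeds[-1]:.3f} m/s")
```

Replanning at every step from the measured speed is what makes the scheme receding-horizon: only the first acceleration of each plan is applied, so a changed driver command is picked up at the next sampling instant.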
