
Acta Armamentarii, 2024, Vol. 45, Issue (11): 3903-3914. doi: 10.12382/bgxb.2023.1182


A DDPG-Based Trajectory Planning Method for Collision Avoidance of Morphing Spacecraft

DING Tianyun1,2, XIA Yi1,2, MEI Zewei1,2, SHAO Xingling2,3,*, LIU Jun1,2

  1. School of Instrument and Electronics, North University of China, Taiyuan 030051, Shanxi, China
  2. Key Laboratory of Instrumentation Science & Dynamic Measurement of Ministry of Education, North University of China, Taiyuan 030051, Shanxi, China
  3. School of Electrical and Control Engineering, North University of China, Taiyuan 030051, Shanxi, China
  • Received: 2023-12-11 Online: 2024-04-22
  • Funding: National Natural Science Foundation of China (12345678); National Natural Science Foundation of China (23456789)

Abstract:

To address the strong coupling between guidance and morphing decisions in morphing spacecraft, a collision-avoidance trajectory planning method based on deep deterministic policy gradient (DDPG) is proposed. A kinematic model of the morphing spacecraft is established in terms of the morphing parameter. A longitudinal guidance law with range-error correction and a lateral guidance law based on line-of-sight angle deviation are designed so that the vehicle flies around obstacles while maintaining terminal guidance accuracy. A Markov decision model suited to continuous morphing is then constructed, with the angle of attack, Mach number, and relative distance between the spacecraft and the obstacle as the state space. A potential-field penalty function that accounts for collisions and a reward function that favors the smallest terminal guidance error are designed, and a DDPG network is trained to map the state space to actions and output the optimal shape decision command. Simulation results show that, compared with a fixed-configuration spacecraft, optimal shape decisions improve the guidance accuracy and lateral obstacle-avoidance capability of the morphing spacecraft and relax the requirement on the detection capability of the airborne radar, which reduces sensing cost.

Key words: morphing spacecraft, deep deterministic policy gradient, intelligent decision-making, trajectory planning, collision avoidance
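
The abstract names the state variables, a potential-field collision penalty, and a guidance-accuracy reward, but does not give their functional forms. The following is a minimal Python sketch of how such a reward might be assembled, assuming a classic repulsive potential for the penalty, an exponential terminal reward, and an actor output in [-1, 1] that is rescaled to a morphing range. All names and constants (`State`, `potential_penalty`, `safe_radius`, `scale`, the [0, 1] morphing range) are hypothetical illustrations and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class State:
    """State variables listed in the abstract (units are assumptions)."""
    alpha_deg: float      # angle of attack, degrees
    mach: float           # Mach number
    obstacle_dist: float  # spacecraft-to-obstacle distance, metres


def potential_penalty(dist: float, safe_radius: float = 5_000.0,
                      gain: float = 1.0) -> float:
    """Assumed repulsive-potential penalty: zero outside safe_radius,
    increasingly negative as the obstacle gets closer."""
    if dist >= safe_radius:
        return 0.0
    d = max(dist, 1e-6)  # guard against division by zero
    return -0.5 * gain * (safe_radius / d - 1.0) ** 2


def terminal_reward(miss_distance: float, scale: float = 100.0) -> float:
    """Assumed terminal reward: equals 1 at zero guidance error and decays
    as the miss distance grows."""
    return math.exp(-miss_distance / scale)


def scale_action(raw: float, low: float = 0.0, high: float = 1.0) -> float:
    """Rescale a bounded actor output in [-1, 1] (e.g. after tanh) to a
    morphing command in [low, high]; out-of-range inputs are clipped."""
    raw = max(-1.0, min(1.0, raw))
    return low + 0.5 * (raw + 1.0) * (high - low)


def reward(state: State, done: bool, miss_distance: float = 0.0) -> float:
    """Per-step reward: collision penalty, plus the terminal term at episode end."""
    r = potential_penalty(state.obstacle_dist)
    if done:
        r += terminal_reward(miss_distance)
    return r


# Usage: halfway inside the assumed safe radius, mid-episode.
s = State(alpha_deg=10.0, mach=6.0, obstacle_dist=2_500.0)
print(reward(s, done=False))         # -0.5
print(scale_action(math.tanh(0.0)))  # 0.5
```

In this sketch the penalty acts at every step while the accuracy term is paid only at the terminal step, which is one common way to combine obstacle avoidance with a terminal objective. The paper's actual weighting and shaping may differ.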
