
Acta Armamentarii (兵工学报)


  • Corresponding authors: *xu56419@126.com; **devil_cjs@163.com

Cross-modal Place Recognition Method Based on Aerial Images and Ground Point Clouds

ZHANG Zhichao1, XU Youchun1*, CHEN Jinsheng1**, MU Weiwei2, LU Feng1

  1. Research Institute of Military Transportation, Army Military Transportation University, Tianjin 300161, China; 2. First Military Representative Office of Air Force Equipment in Tianjin Area, Tianjin 300385, China
  • Received: 2024-12-12; Revised: 2025-03-22


Abstract: To address the localization problem of unmanned ground vehicles exploring unknown areas under satellite-denied conditions, a cross-modal place recognition approach that matches aerial images to ground point clouds is proposed, and an air-ground collaborative place recognition network architecture, AG-PRNet, is designed for this purpose. Cross-modal place recognition faces significant challenges, including the pronounced heterogeneity of low-level features across modalities and large rotation and translation variations. First, the point cloud is projected into the bird's-eye-view (BEV) space during data preprocessing to reduce its heterogeneity with aerial images. Then, a rotation- and translation-invariant feature encoding module (RATI-CNN) is designed to extract rotation- and translation-invariant features from the multi-modal data. Finally, a cross-attention module fuses and learns the shared features of the multi-modal data, effectively improving the robustness of cross-modal feature matching. In addition, a cross-modal place recognition dataset, the CDPR Dataset, is constructed, and comparative and ablation experiments are conducted on it using average recall as the evaluation metric. Experimental results show that the proposed method achieves a Top-1 recall of 60.08% and a Top-5 recall above 76%, demonstrating a clear advantage over other methods.
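The BEV preprocessing step described in the abstract can be illustrated with a minimal sketch. This is not the paper's AG-PRNet implementation; the function name, grid extents, and cell resolution below are illustrative assumptions:

```python
import numpy as np

def pointcloud_to_bev(points, x_range=(-50.0, 50.0), y_range=(-50.0, 50.0),
                      resolution=0.25):
    """Project an N x 3 ground point cloud onto a 2-D bird's-eye-view
    height grid, so it shares an image-like layout with aerial photos.

    Assumed layout: rows index y, columns index x; each cell stores the
    maximum point height falling into it (0 for empty cells).
    """
    xs, ys, zs = points[:, 0], points[:, 1], points[:, 2]
    # Keep only points inside the chosen ground-plane window.
    mask = ((xs >= x_range[0]) & (xs < x_range[1]) &
            (ys >= y_range[0]) & (ys < y_range[1]))
    xs, ys, zs = xs[mask], ys[mask], zs[mask]

    w = int((x_range[1] - x_range[0]) / resolution)
    h = int((y_range[1] - y_range[0]) / resolution)
    cols = ((xs - x_range[0]) / resolution).astype(int)
    rows = ((ys - y_range[0]) / resolution).astype(int)

    bev = np.zeros((h, w), dtype=np.float32)
    # Unbuffered scatter: keep the maximum height per cell.
    np.maximum.at(bev, (rows, cols), zs)
    return bev
```

A 100 m x 100 m window at 0.25 m resolution yields a 400 x 400 single-channel image that can be fed to a CNN alongside the aerial image.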

Key words: Cross-modal, Aerial Images, Ground Point Clouds, Place Recognition, Rotation and Translation Invariance
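The Top-K recall reported in the experiments can be computed along the following lines. This is a generic retrieval-evaluation sketch, not the paper's exact protocol; the use of cosine similarity and the function name are assumptions:

```python
import numpy as np

def recall_at_k(query_desc, db_desc, gt_indices, k):
    """Fraction of queries whose ground-truth database entry appears
    among the k most similar database descriptors.

    query_desc: (Q, D) query descriptors (e.g. from point-cloud BEVs)
    db_desc:    (N, D) database descriptors (e.g. from aerial images)
    gt_indices: length-Q list, true database index for each query
    """
    # L2-normalize so the dot product is cosine similarity.
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    d = db_desc / np.linalg.norm(db_desc, axis=1, keepdims=True)
    sim = q @ d.T                              # (Q, N) similarity matrix
    topk = np.argsort(-sim, axis=1)[:, :k]     # k best matches per query
    hits = [gt in row for gt, row in zip(gt_indices, topk)]
    return float(np.mean(hits))
```

Averaging this quantity over the query set for k = 1 and k = 5 gives Top-1 and Top-5 recall figures of the kind quoted in the abstract.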