
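Chongqing Innovation Center, Beijing Institute of Technology, Chongqing 401120
School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081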
Received: 30 June 2025
Online First: 25 December 2025
Published: 2026-04
GAO Song, WU Wenbo, JIANG Yuzhu, et al. 3D Detection of Low-altitude Small Objects by Infrared and LiDAR Data Fusion[J]. Acta Armamentarii, 2026, 47(4): 250571. DOI: 10.12382/bgxb.2025.0571.
Low-altitude flight poses significant challenges for unmanned aerial vehicles (UAVs) in detecting and avoiding thin obstacles such as power lines and tree branches. To address this issue, an infrared image dataset of power lines and trees, comprising 2912 images annotated with 14000 labels, is constructed, and an improved YOLOv8-based aerial object detection model is proposed. The model incorporates a multi-scale perceptual feature fusion module (SE-EMA) to strengthen the perception of multi-scale targets, and an adaptive threshold focal loss (ATFL) function is introduced to optimize sample balance. Based on the detection results, the LiDAR point cloud and infrared images are fused to output the high-precision three-dimensional coordinates, distance, and semantic category of each target. Validated on the self-collected dataset, the proposed model improves mAP@50-95 by 3.3 percentage points over the original YOLOv8-seg model while reaching a detection speed of 170 frames per second (FPS). When the model is deployed on a test platform for 3D target distance measurement, the ranging error is less than 10 cm. These results indicate that the proposed model is adaptable and robust, and that it improves the accuracy of detecting and localizing slender targets such as power lines and tree branches against complex backgrounds in low-altitude flight.
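The abstract does not spell out how the point cloud and the detection results are combined. The following is a minimal sketch of one common approach, not the authors' method: each LiDAR point is transformed into the camera frame, projected with a pinhole model, and kept if it lands on a pixel of the detected mask. The function names, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the extrinsics (`rotation`, `translation`), and the use of a per-axis median are all assumptions made for illustration.

```python
import math
from statistics import median


def project_point(point_lidar, rotation, translation, fx, fy, cx, cy):
    """Map one LiDAR point into the camera frame and project it (pinhole model).

    Returns (u, v, point_cam), or None when the point is behind the camera.
    """
    x, y, z = point_lidar
    # p_cam = R * p_lidar + t, with R as a 3x3 nested list and t as a 3-vector.
    p_cam = tuple(
        rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z + translation[i]
        for i in range(3)
    )
    if p_cam[2] <= 0.0:
        return None
    u = fx * p_cam[0] / p_cam[2] + cx
    v = fy * p_cam[1] / p_cam[2] + cy
    return u, v, p_cam


def locate_target(points_lidar, mask_pixels, rotation, translation, fx, fy, cx, cy):
    """Estimate a target position from LiDAR points that project onto its mask.

    mask_pixels is a set of (column, row) pixels taken from a segmentation
    result. Returns (position_in_camera_frame, distance), or None when no
    LiDAR point falls on the mask.
    """
    hits = []
    for point in points_lidar:
        projected = project_point(point, rotation, translation, fx, fy, cx, cy)
        if projected is None:
            continue
        u, v, p_cam = projected
        if (int(round(u)), int(round(v))) in mask_pixels:
            hits.append(p_cam)
    if not hits:
        return None
    # A per-axis median limits the influence of stray background points.
    position = tuple(median(p[axis] for p in hits) for axis in range(3))
    distance = math.sqrt(sum(c * c for c in position))
    return position, distance


# Hypothetical calibration: identity extrinsics, so the LiDAR frame is treated
# as already aligned with the camera frame (z pointing forward).
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cloud = [(0.0, 0.0, 5.0), (0.1, 0.0, 5.0), (1.0, 1.0, 5.0), (0.0, 0.0, -1.0)]
mask = {(50, 50), (52, 50)}
result = locate_target(cloud, mask, IDENTITY, [0.0, 0.0, 0.0], 100.0, 100.0, 50.0, 50.0)
```

In this toy example only the first two points project onto the mask; the third lands outside it and the fourth is behind the camera. Thin targets such as wires return few LiDAR points, so a real pipeline would likely need a tolerance around the mask and careful sensor calibration.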