Cross-layer Dynamic Detection Network for Small Target Detection in Aerial Photography

doi:10.12382/bgxb.2025.0399

Abstract

To address the challenges of extreme scale variation, dense occlusion of small targets, and complex background interference in unmanned aerial vehicle (UAV) target detection, this paper proposes a cross-layer dynamic detection network, based on an improved YOLOv10, for detecting small targets in UAV aerial imagery. A dual-branch cross-layer feature fusion pyramid network replaces the original pyramid network to address the insufficient preservation of small-target detail in traditional methods. A channel-shuffle depthwise upsampling module combines channel shuffle operations with depthwise separable convolutions and enhances the edge features of small targets through high-frequency residuals. An end-to-end dynamic detection head replaces the original detection head and introduces a dynamic weighting mechanism, so that the feature representation at each position adapts to its contextual information. Experimental results show that the proposed network achieves an mAP0.5 of 53.3% and an mAP0.5:0.95 of 33.2% on the VisDrone2019 validation set, improvements of 12.7% and 9%, respectively, over YOLOv10s, while reducing the number of model parameters by 23.7% and reaching 79 FPS. The proposed algorithm significantly improves detection accuracy while maintaining good inference speed, and is of considerable practical value.

Key words: small target detection, aerial image, feature fusion, YOLOv10, unmanned aerial vehicle, channel shuffle
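
The two building blocks named in the abstract, channel shuffle and depthwise separable convolution, can be illustrated generically. The sketch below is not the authors' upsampling module: it shows only the standard ShuffleNet-style channel interleaving and the usual weight-count comparison between a standard and a depthwise separable convolution. The function names and the example sizes (6 channels, 2 groups, 128 channels, 3×3 kernel) are assumptions chosen for illustration.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (generic ShuffleNet-style shuffle).

    `channels` is any list with one entry per channel. Conceptually the list
    is viewed as (groups, per_group), transposed, and flattened again.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + j]
            for j in range(per_group)
            for g in range(groups)]


def standard_conv_weights(c_in, c_out, k):
    """Weights of a bias-free k x k convolution."""
    return c_in * c_out * k * k


def depthwise_separable_weights(c_in, c_out, k):
    """Weights of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return c_in * k * k + c_in * c_out


# Hypothetical example sizes, not taken from the paper.
shuffled = channel_shuffle(list(range(6)), groups=2)
dense = standard_conv_weights(128, 128, 3)
separable = depthwise_separable_weights(128, 128, 3)
print(shuffled, dense, separable, round(separable / dense, 3))
```

After the shuffle, neighbouring channels come from different groups, which is what lets a subsequent grouped or depthwise layer mix information across groups at low cost.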

LI Keting, ZHAO Zijie, YING Zhanfeng, SHEN Shiqi. Cross-layer Dynamic Detection Network for Small Target Detection in Aerial Photography[J]. Acta Armamentarii, 2025, 46(S1): 250399-.

Parameter	Setting
epochs	300
batch	8
workers	8
imgsz	640
optimizer	SGD
lrf	0.01
lr0	0.1
momentum	0.973
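
The training settings can be collected into a single configuration for reuse. The sketch below assumes that the three values 0.01, 0.1 and 0.973 map positionally onto lrf, lr0 and momentum, and that the parameter names follow the Ultralytics training arguments, where lrf is the final learning rate expressed as a fraction of lr0. The commented-out call is only an illustration; the model and dataset file names in it are assumptions, not taken from the source.

```python
# Training settings as listed in the table (positional mapping assumed
# for lrf, lr0 and momentum).
train_args = {
    "epochs": 300,
    "batch": 8,
    "workers": 8,
    "imgsz": 640,
    "optimizer": "SGD",
    "lrf": 0.01,
    "lr0": 0.1,
    "momentum": 0.973,
}

# Under the Ultralytics convention, the learning rate at the end of
# training is lr0 * lrf.
final_lr = train_args["lr0"] * train_args["lrf"]
print(f"initial lr = {train_args['lr0']}, final lr = {final_lr:.4f}")

# Illustrative usage only (requires the ultralytics package; the file
# names are assumptions):
# from ultralytics import YOLO
# YOLO("yolov10s.yaml").train(data="VisDrone.yaml", **train_args)
```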

Module	Parameters	Channels	Output size
Conv	928	64	320×320
Conv	18560	128	160×160
C2f	29056	128	160×160
Conv	73984	256	80×80
C2f	197632	256	80×80
SCDown	36096	512	40×40
C2f	788480	512	40×40
SCDown	137728	1024	20×20
C2fCIB	958464	1024	20×20
SPPF	656896	1024	20×20
PSA	990976	1024	20×20
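
The Conv and SCDown rows of this listing can be reproduced by hand, which also clarifies the channel column. The sketch below rests on several assumptions not stated in the source: the listed channel counts are nominal and are halved by a width multiplier of 0.5 (as in YOLOv10s), the input is a 3-channel RGB image, a Conv block is a bias-free 3×3 convolution plus batch normalization, and SCDown is a 1×1 pointwise convolution followed by a 3×3 depthwise convolution, each with batch normalization. The helper names are hypothetical.

```python
WIDTH = 0.5  # assumed width multiplier (YOLOv10s)


def scaled(nominal_channels):
    """Actual channel count after applying the assumed width multiplier."""
    return int(nominal_channels * WIDTH)


def conv_bn_params(c_in, c_out, k, groups=1):
    """Bias-free k x k convolution plus BatchNorm scale and shift."""
    conv = (c_in // groups) * c_out * k * k
    batch_norm = 2 * c_out
    return conv + batch_norm


def scdown_params(c_in, c_out, k=3):
    """1 x 1 pointwise conv, then k x k depthwise conv, each with BatchNorm."""
    pointwise = conv_bn_params(c_in, c_out, 1)
    depthwise = conv_bn_params(c_out, c_out, k, groups=c_out)
    return pointwise + depthwise


# (row label, computed parameter count, value listed in the table)
checks = [
    ("Conv 64", conv_bn_params(3, scaled(64), 3), 928),
    ("Conv 128", conv_bn_params(scaled(64), scaled(128), 3), 18560),
    ("Conv 256", conv_bn_params(scaled(128), scaled(256), 3), 73984),
    ("SCDown 512", scdown_params(scaled(256), scaled(512)), 36096),
    ("SCDown 1024", scdown_params(scaled(512), scaled(1024)), 137728),
]

for label, computed, listed in checks:
    print(f"{label:12s} computed={computed:7d} listed={listed:7d}")
```

Under these assumptions the computed counts agree with the listed ones, which suggests that the channel column reports nominal widths before scaling.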

Module	Parameters	Channels	Output size
Conv	928	64	320×320
C2f	7360	64	320×320
Conv	18560	128	160×160
C2f	49664	128	160×160
Conv	73984	256	80×80
C2f	197632	256	80×80
SCDown	36096	512	40×40
C2fCIB	249856	512	40×40
SPPF	164608	512	40×40
PSA	249728	512	40×40
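
Summing the parameter column of the two distinct module listings in this section gives a rough size comparison. The names `listing_20x20` and `listing_40x40` are hypothetical labels taken from each listing's final output size; the source does not caption the listings, so the sketch makes no claim about which one is the baseline. The totals cover only the modules shown and are therefore not comparable to the 23.7% whole-model reduction reported in the abstract.

```python
# Parameter counts copied from the listing that ends at 20x20.
listing_20x20 = [
    ("Conv", 928), ("Conv", 18560), ("C2f", 29056), ("Conv", 73984),
    ("C2f", 197632), ("SCDown", 36096), ("C2f", 788480),
    ("SCDown", 137728), ("C2fCIB", 958464), ("SPPF", 656896),
    ("PSA", 990976),
]

# Parameter counts copied from the listing that ends at 40x40.
listing_40x40 = [
    ("Conv", 928), ("C2f", 7360), ("Conv", 18560), ("C2f", 49664),
    ("Conv", 73984), ("C2f", 197632), ("SCDown", 36096),
    ("C2fCIB", 249856), ("SPPF", 164608), ("PSA", 249728),
]


def total_params(listing):
    """Sum the parameter counts of all modules in a listing."""
    return sum(count for _, count in listing)


total_20 = total_params(listing_20x20)
total_40 = total_params(listing_40x40)
reduction = 1 - total_40 / total_20

print(f"20x20 listing: {total_20} parameters")
print(f"40x40 listing: {total_40} parameters")
print(f"relative difference: {reduction:.1%}")
```

Most of the difference comes from the 1024-channel modules at 20×20, which appear only in the first listing.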