Acta Armamentarii (兵工学报), 2023, Vol. 44, Issue 9: 2600-2610. doi: 10.12382/bgxb.2022.1147

Special topic: Intelligent Systems and Equipment Technology

Enhanced Multi-scale Target Detection Method Based on YOLOv5

HUI Kanghua1, YANG Wei1, LIU Haohan1, ZHANG Zhi1,*, ZHENG Jin2, BAI Xiao2

  1 College of Computer Science & Technology, Civil Aviation University of China, Tianjin 300300, China
  2 School of Computer Science and Technology, Beihang University, Beijing 100191, China
  • Received: 2022-11-30 Online: 2023-04-10
  • Supported by:
    Scientific Research Project of the Tianjin Municipal Education Commission (2020KJ024); Fundamental Research Funds for the Central Universities, Special Fund of Civil Aviation University of China (3122020052); Fundamental Research Funds for the Central Universities, Special Fund of Civil Aviation University of China (3122015C021)

Abstract:

To address the difficulty of matching initial anchor boxes to targets and the weak multi-scale detection capability in complex scenes, an enhanced multi-scale target detection method based on YOLOv5 (EM-YOLOv5) is proposed. The Kmeans++ clustering algorithm is used to obtain multi-scale initial anchor boxes adapted to the current detection scene, which makes it easier for the network to capture targets of different scales. Several parallel convolution branches of different scales are then added to the Bottleneck structure, fusing multi-scale feature information while retaining the original feature information, which enhances the global perception ability of the model. The proposed EM-YOLOv5s model is tested on the VisDrone2019, COCO2017, and PASCAL VOC2012 datasets. The experimental results show that, compared with the YOLOv5s model, key indicators such as mAP@0.5:0.95 and mAP@0.5 are improved. On PASCAL VOC2012, mAP@0.5:0.95 increases by 5.2% while the detection time increases by only 1.9 ms, indicating that the EM-YOLOv5 model can effectively improve target detection accuracy in general complex scenes.
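
The abstract does not describe how the anchor clustering is configured. As a minimal sketch only, assuming each ground-truth box is reduced to a (width, height) pair and that 1 - IoU serves as the distance (a common choice for YOLO anchors, not confirmed by this abstract), k-means++ seeding followed by ordinary Lloyd iterations could look like this. The function names, the mean-based centre update, and the sample boxes are illustrative and not taken from the paper.

```python
import random


def wh_iou(a, b):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def iou_distance(a, b):
    """Assumed clustering distance: 1 - IoU (0 for identical shapes)."""
    return 1.0 - wh_iou(a, b)


def kmeanspp_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs into k anchors, returned sorted by area.

    Assumes `boxes` contains at least k distinct shapes.
    """
    rng = random.Random(seed)

    # k-means++ seeding: the first centre is uniform at random; each later
    # centre is drawn with probability proportional to the squared distance
    # to the nearest centre chosen so far.
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        weights = [min(iou_distance(b, c) for c in centres) ** 2 for b in boxes]
        centres.append(rng.choices(boxes, weights=weights, k=1)[0])

    # Lloyd iterations: assign each box to its nearest centre, then move
    # every centre to the mean width/height of its cluster.
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            nearest = min(range(k), key=lambda i: iou_distance(b, centres[i]))
            clusters[nearest].append(b)
        updated = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centres[i]  # keep the old centre if a cluster is empty
            for i, c in enumerate(clusters)
        ]
        if updated == centres:
            break
        centres = updated
    return sorted(centres, key=lambda c: c[0] * c[1])


# Illustrative box sizes in pixels (made up for this sketch).
sample_boxes = [(10, 12), (12, 10), (11, 11),
                (50, 60), (60, 50), (55, 55),
                (200, 180), (180, 200), (190, 190)]
print(kmeanspp_anchors(sample_boxes, k=3))
```

In a YOLOv5-style setup the number of clusters would typically equal the total number of anchors across detection scales, with the sorted result split from the smallest to the largest scale; that mapping is also an assumption here.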

Key words: YOLOv5 model, target detection, clustering algorithm, multi-scale convolution, feature fusion
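
The parallel multi-scale branches mentioned in the abstract can be pictured with a toy, dependency-free sketch of the data flow: several convolution branches with different kernel sizes run side by side, their outputs are fused, and the input is added back through a shortcut so the original features are retained. Everything specific below is an assumption for illustration, not the paper's design: the kernel sizes (1, 3, 5), fusion by averaging, the single-channel input, and the fixed averaging kernels standing in for learned weights. A real model would use a deep learning framework with learned multi-channel convolutions.

```python
def conv2d_same(x, kernel):
    """Single-channel 2D cross-correlation, stride 1, zero padding ('same' size).

    Assumes a square kernel with odd side length.
    """
    h, w = len(x), len(x[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    ii, jj = i + di - r, j + dj - r
                    if 0 <= ii < h and 0 <= jj < w:  # outside = zero padding
                        acc += kernel[di][dj] * x[ii][jj]
            out[i][j] = acc
    return out


def mean_kernel(k):
    """Hypothetical stand-in for learned weights: a k x k averaging kernel."""
    return [[1.0 / (k * k)] * k for _ in range(k)]


def multi_scale_bottleneck(x, kernel_sizes=(1, 3, 5)):
    """Run parallel branches with different receptive fields, average their
    outputs, and add the input back (residual shortcut)."""
    branches = [conv2d_same(x, mean_kernel(k)) for k in kernel_sizes]
    h, w = len(x), len(x[0])
    return [
        [x[i][j] + sum(b[i][j] for b in branches) / len(branches)
         for j in range(w)]
        for i in range(h)
    ]


# A single bright pixel in the middle of a 5 x 5 map: the larger kernels
# spread its influence to positions the 1 x 1 branch never reaches.
impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 1
for row in multi_scale_bottleneck(impulse):
    print(["%.3f" % v for v in row])
```

With the impulse input, only the widest branch contributes at the corners, which is the sense in which adding larger-scale branches widens the context each output position sees.
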

CLC number: