
Acta Armamentarii ›› 2023, Vol. 44 ›› Issue (11): 3508-3515. doi: 10.12382/bgxb.2022.1167

Special topic: Swarm Collaboration and Autonomous Technology


Few-shot Object Detection Based on Convolution Network and Attention Mechanism

GUO Yonghong (郭永红)1, NIU Haitao (牛海涛)1, SHI Chao (史超)1,2,*, GUO Cheng (郭铖)1

  1 Institute of Computer Application Technology, NORINCO Group, Beijing 100089, China
  2 School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2022-11-30 Online: 2023-11-07

Abstract:

Few-shot object detection (FSOD) aims to enable a detector to adapt to unseen classes using only a small number of training samples. Typical FSOD methods take Faster R-CNN as the basic detection framework and use a convolutional neural network to extract image features. However, the pooling operations used in convolutional neural networks, although intended to capture as much image information as possible, inevitably lose some of that information. A hybrid dilated convolution is therefore introduced into the backbone network to ensure a larger receptive field and minimize the loss of image information. To make full use of the given support data in the k-shot setting, a support feature dynamic fusion module is proposed: it adaptively fuses the support features, weighting each one by its correlation with the query feature, to obtain stronger support cues. Experimental results show that the proposed method achieves good, state-of-the-art FSOD performance on the public Pascal VOC and MS-COCO datasets.

Key words: few-shot object detection, hybrid dilated convolution, support feature dynamic fusion
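
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption: the function names, the dilation rates (1, 2, 5), the use of cosine similarity as the "correlation", the softmax normalization of the weights, and the use of pooled feature vectors instead of feature maps are all hypothetical choices made for illustration. The first function computes the receptive field of stacked stride-1 dilated convolutions; the second fuses k support vectors using weights derived from their similarity to a query vector.

```python
import math


def receptive_field(dilations, kernel=3):
    """Side length of the receptive field of stacked stride-1 dilated convs.

    Each layer with kernel size k and dilation d adds (k - 1) * d pixels.
    """
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)  # epsilon guards against zero vectors


def fuse_support(support_feats, query_feat):
    """Fuse k support vectors into one, weighted by similarity to the query.

    Assumption: the 'correlation' is cosine similarity and the weights are
    softmax-normalized; the abstract does not specify either choice.
    """
    scores = [cosine(s, query_feat) for s in support_feats]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [
        sum(w * s[i] for w, s in zip(weights, support_feats))
        for i in range(len(query_feat))
    ]
    return fused, weights


if __name__ == "__main__":
    # Hypothetical dilation rates; three plain 3x3 convs are shown for contrast.
    print(receptive_field([1, 1, 1]))  # 7
    print(receptive_field([1, 2, 5]))  # 17

    # Two toy support vectors (k = 2) and one query vector.
    fused, weights = fuse_support([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
    print(weights, fused)
```

With the same number of 3x3 layers, the dilated stack covers a 17-pixel window instead of 7 without any pooling. In the fusion step, the support vector that matches the query more closely receives the larger weight, in contrast to a plain average over the k shots.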

CLC number: