Acta Armamentarii ›› 2022, Vol. 43 ›› Issue (10): 2687-2704. doi: 10.12382/bgxb.2021.0610
• Review •
YANG Chuandong, QIAN Lizhi, XUE Song, CHEN Dong, LING Chong
Online:
2022-05-19
Corresponding author:
QIAN Lizhi (b. 1963), male, professor, doctoral supervisor
E-mail: qianlizhi1@hotmail.com
First author:
YANG Chuandong (b. 1994), male, Ph.D. candidate. E-mail: yangcd19941030@163.com
Abstract: Target detection in missile-borne images is a key technology for enabling image-homing ammunition to operate in "fire-and-forget" mode and strike targets autonomously. Image homing of ammunition faces harsh imaging environments, rapidly changing target characteristics, and stringent constraints on algorithm size and speed. This paper surveys these difficult problems in missile-borne target detection. Deep-learning-based detection methods are grouped into proposal-based, proposal-free, and Transformer-based approaches, and the main research progress of each class is reviewed. Key techniques for deploying missile-borne image target detection models are then organized, including lightweight feature-extraction networks, enhancement of prediction feature maps, non-maximum suppression (NMS) post-processing, sample balancing during training, and model compression. Finally, the performance of typical detection methods on the ImageNet, COCO, and missile-borne image datasets is compared, and future developments are discussed.
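Among the deployment techniques listed above, NMS post-processing is the step that prunes overlapping detections. A minimal sketch of the Gaussian Soft-NMS variant (which decays the scores of overlapping boxes instead of discarding them outright) is shown below; the function name, parameter values, and thresholds are illustrative, not taken from any of the surveyed implementations.

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: rather than deleting boxes that overlap the
    current best detection, decay their scores by exp(-iou^2 / sigma).

    boxes:  (N, 4) array of [x1, y1, x2, y2]
    scores: (N,)   array of confidence scores
    Returns the indices of kept boxes, highest score first.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    keep = []
    idxs = list(range(len(scores)))
    while idxs:
        # Pick the highest-scoring remaining box.
        m = max(idxs, key=lambda i: scores[i])
        if scores[m] < score_thresh:
            break
        keep.append(m)
        idxs.remove(m)
        x1, y1, x2, y2 = boxes[m]
        area_m = (x2 - x1) * (y2 - y1)
        for i in idxs:
            # Intersection-over-union with the selected box.
            xx1 = max(x1, boxes[i][0]); yy1 = max(y1, boxes[i][1])
            xx2 = min(x2, boxes[i][2]); yy2 = min(y2, boxes[i][3])
            inter = max(0.0, xx2 - xx1) * max(0.0, yy2 - yy1)
            area_i = (boxes[i][2] - boxes[i][0]) * (boxes[i][3] - boxes[i][1])
            iou = inter / (area_m + area_i - inter)
            # Gaussian decay: heavy overlap -> strong score suppression.
            scores[i] *= np.exp(-(iou * iou) / sigma)
    return keep
```

With the default low score threshold, heavily overlapped boxes survive with reduced scores; raising `score_thresh` recovers behavior closer to hard NMS, which is the trade-off Soft-NMS exposes.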
YANG Chuandong, QIAN Lizhi, XUE Song, CHEN Dong, LING Chong. Review on Target Detection of Image Homing Ammunition[J]. Acta Armamentarii, 2022, 43(10): 2687-2704.