DAI Wenjun, CHANG Tianqing, CHU Kaixuan, ZHANG Lei, GUO Libin. Video Object Detection Method for Tank Fire Control System Based on Spatial-temporal Convolution Feature Memory Model[J]. Acta Armamentarii, 2020, 41(9): 1708-1718.