Illumination-aware Multispectral Fusion Network for Pedestrian Detection

doi:10.12382/bgxb.2022.1114

Abstract

Multispectral pedestrian detection is widely used in applications such as intelligent security and autonomous driving. However, its accuracy and robustness remain challenging in low-light conditions and in scenes with occlusion. To address this issue, a novel illumination-aware cross-spectral fusion network for pedestrian detection is proposed. The network uses cross-attention and illumination-aware mechanisms to fully exploit spectrum-specific features, thereby improving the robustness and accuracy of pedestrian detection. A cross-attention module is introduced to enhance feature representation between the two spectra. In addition, an illumination-aware sub-network is proposed that adaptively selects effective spectral features according to illumination intensity variations in the visible and infrared spectra, which improves the robustness of the detection system. Experiments are conducted on two mainstream multispectral pedestrian detection datasets, KAIST and CVC-14. The results show that the proposed method outperforms existing methods in both detection accuracy and detection speed. These results are significant for improving the robustness and generality of pedestrian detection models and have broad potential for practical applications.

Key words: multispectral fusion detection, pedestrian detection, deep learning, attention mechanism

CLC number: TJ301

PENG Peiran, REN Shubo, LI Jianan, ZHOU Hongwei, XU Tingfa. Illumination-aware Multispectral Fusion Network for Pedestrian Detection[J]. Acta Armamentarii, 2023, 44(9): 2622-2630.
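The abstract names two mechanisms: cross-attention between the visible and infrared feature streams, and an illumination-aware sub-network that decides how much each spectrum should contribute. The NumPy sketch below illustrates that general idea only. It is not the authors' architecture: the function names, the token-shaped features, the residual connection, and especially `illumination_weight` (a fixed sigmoid of mean brightness standing in for a learned sub-network) are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_feat, context_feat):
    """Let every token of one spectrum attend over the other spectrum.

    Both inputs have shape (num_tokens, channels). The attended context
    is added back to the query features as a residual (an assumption).
    """
    scale = np.sqrt(query_feat.shape[-1])
    attn = softmax(query_feat @ context_feat.T / scale, axis=-1)
    return query_feat + attn @ context_feat


def illumination_weight(visible_image):
    """Hypothetical stand-in for a learned illumination-aware sub-network.

    Maps the mean brightness of a visible image (values in [0, 1]) to a
    weight in (0, 1): bright scenes favour the visible stream.
    """
    brightness = float(visible_image.mean())
    return 1.0 / (1.0 + np.exp(-10.0 * (brightness - 0.5)))


def fuse(vis_feat, ir_feat, visible_image):
    """Cross-enhance both streams, then blend them by illumination."""
    vis_enh = cross_attention(vis_feat, ir_feat)
    ir_enh = cross_attention(ir_feat, vis_feat)
    w = illumination_weight(visible_image)
    return w * vis_enh + (1.0 - w) * ir_enh, w


# Toy usage with random features and synthetic bright and dark images.
rng = np.random.default_rng(0)
vis_feat = rng.standard_normal((16, 8))
ir_feat = rng.standard_normal((16, 8))
day_image = np.full((4, 4), 0.9)
night_image = np.full((4, 4), 0.1)

fused_day, w_day = fuse(vis_feat, ir_feat, day_image)
fused_night, w_night = fuse(vis_feat, ir_feat, night_image)
print(f"visible weight by day: {w_day:.3f}, by night: {w_night:.3f}")
```

In this toy setup the visible stream dominates for the bright image and the infrared stream dominates for the dark one. A trained sub-network would presumably learn such a mapping from data instead of using a fixed brightness threshold.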

Figures/Tables: 10

References: 26

[1]	LI B, WANG B, HAN J Y, et al. Infrared image moving object detection technology based on onboard computer[J]. Acta Armamentarii, 2022, 43(S1): 66-73. (in Chinese) doi: 10.12382/bgxb.2022.A012
[2]	BILAL M, KHAN A, KHAN M U K, et al. A low-complexity pedestrian detection framework for smart video surveillance systems[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 27(10): 2260-2273. doi: 10.1109/TCSVT.2016.2581660
[3]	PAIS G D, DIAS T J, NASCIMENTO J C, et al. OmniDRL: robust pedestrian detection using deep reinforcement learning on omnidirectional cameras[C]//Proceedings of 2019 International Conference on Robotics and Automation. Washington, D.C., US: IEEE, 2019: 4782-4789.
[4]	PAREKH D, PODDAR N, RAJPURKAR A, et al. A review on autonomous vehicles: progress, methods and challenges[J]. Electronics, 2022, 11(14): 2162. doi: 10.3390/electronics11142162
[5]	QIU R, XU M, YAN Y Y, et al. A methodology review on multi-view pedestrian detection[J]. Recent Advancements in Multi-View Data Analytics, 2022: 317-339.
[6]	HWANG S, PARK J, KIM N, et al. Multispectral pedestrian detection: benchmark dataset and baseline[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Washington, D.C., US: IEEE, 2015: 1037-1045.
[7]	KONIG D, ADAM M, JARVERS C, et al. Fully convolutional region proposal networks for multispectral person detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Washington, D.C., US: IEEE, 2017: 49-56.
[8]	ZHANG H, FROMONT E, LEFEVRE S, et al. Multispectral fusion for object detection with cyclic fuse-and-refine blocks[C]//Proceedings of 2020 IEEE International Conference on Image Processing. Washington, D.C., US: IEEE, 2020: 276-280.
[9]	LI C Y, SONG D, TONG R F, et al. Illumination-aware faster R-CNN for robust multispectral pedestrian detection[J]. Pattern Recognition, 2019, 85: 161-171. doi: 10.1016/j.patcog.2018.08.005
[10]	ZHANG L, ZHU X Y, CHEN X Y, et al. Weakly aligned cross-modal learning for multispectral pedestrian detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, Korea: IEEE, 2019: 5127-5137.
[11]	KIEU M, BAGDANOV A D, BERTINI M, et al. Task-conditioned domain adaptation for pedestrian detection in thermal imagery[C]//Proceedings of European Conference on Computer Vision. Cham, Switzerland: Springer, 2020: 546-562.
[12]	LIU J J, ZHANG S T, WANG S, et al. Multispectral deep neural networks for pedestrian detection: arXiv:1611.02644[R]. Ithaca, NY, US: Cornell University, 2016: 1611.02644.
[13]	WAGNER J, FISCHER V, HERMAN M, et al. Multispectral pedestrian detection using deep fusion convolutional neural networks[C]//Proceedings of the 24th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges, Belgium: ESANN, 2016, 587: 509-514.
[14]	ZHOU K L, CHEN L S, CAO X. Improving multispectral pedestrian detection by addressing modality imbalance problems[C]//Proceedings of European Conference on Computer Vision. Cham, Switzerland: Springer, 2020: 787-803.
[15]	DOLLAR P, APPEL R, BELONGIE S, et al. Fast feature pyramids for object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(8): 1532-1545. doi: 10.1109/TPAMI.2014.2300479 pmid: 26353336
[16]	ZHANG L, LIU Z Y, ZHANG S F, et al. Cross-modality interactive attention network for multispectral pedestrian detection[J]. Information Fusion, 2019, 50: 20-29. doi: 10.1016/j.inffus.2018.09.015
[17]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. Advances in Neural Information Processing Systems, 2017, 30: 13-22.
[18]	HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, US: IEEE, 2018: 7132-7141.
[19]	FU J, LIU J, TIAN H J, et al. Dual attention network for scene segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, US: IEEE, 2019: 3146-3154.
[20]	HUANG L, WANG W M, CHEN J, et al. Attention on attention for image captioning[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, Korea: IEEE, 2019: 4634-4643.
[21]	LI X, WANG W H, HU X L, et al. Selective kernel networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington, D.C., US: IEEE, 2019: 510-519.
[22]	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision. 2018: 3-19. doi: 10.1007/978-3-030-01234-2_1
[23]	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Washington, D.C., US: IEEE, 2017: 2980-2988.
[24]	LI C Y, SONG D, TONG R F, et al. Multispectral pedestrian detection via simultaneous detection and segmentation: arXiv:1808.04818[R]. Ithaca, NY, US: Cornell University, 2018: 1808.04818.
[25]	GONZALEZ A, FANG Z, SOCARRAS Y, et al. Pedestrian detection at day/night time with visible and FIR cameras: a comparison[J]. Sensors, 2016, 16(6): 820. doi: 10.3390/s16060820
[26]	PARK K, KIM S, SOHN K. Unified multi-spectral pedestrian detection based on probabilistic fusion networks[J]. Pattern Recognition, 2018, 80: 143-155. doi: 10.1016/j.patcog.2018.03.007

Method	All	Day	Night	Near	Medium	Far	No occlusion	Partial occlusion	Heavy occlusion
ACF^[15]	47.32	42.57	56.17	28.74	53.67	88.20	62.94	81.40	88.08
Halfway Fusion^[8]	25.75	24.88	26.59	8.13	30.34	75.70	43.13	65.21	74.36
Fusion RPN+BN^[7]	18.29	19.57	16.27	0.04	30.87	88.86	47.75	56.10	72.20
MSDS-RCNN^[24]	11.63	10.60	13.73	1.29	16.19	63.73	29.86	38.71	63.37
IAF-RCNN^[9]	15.73	14.55	18.26	0.96	25.54	77.84	40.17	48.40	69.76
CIAN^[16]	14.12	14.77	11.13	3.71	19.04	55.82	30.31	41.57	62.48
AR-CNN^[10]	9.34	9.94	8.38	0.00	16.08	69.00	31.40	38.63	55.73
MBNet^[14]	8.13	8.28	7.86	0.00	16.07	55.99	27.74	35.43	59.14
Proposed method	7.62	8.51	5.99	0.01	12.18	48.14	24.15	27.84	52.64

Detection method	Miss rate (all)/%	Miss rate (day)/%	Miss rate (night)/%
MACF^[26]	69.71	72.63	65.43
Halfway Fusion^[26]	31.99	36.29	26.29
Park et al.^[26]	26.29	28.67	23.48
AR-CNN^[10]	22.10	24.70	18.10
MBNet^[14]	21.10	24.70	13.50
Proposed method	21.03	24.84	12.97

Detection method	Overall miss rate/%	Detection speed/(ms·frame^-1)	Hardware platform
ACF^[15]	47.32	2730	TITAN X
Halfway Fusion^[8]	25.75	430	TITAN X
Fusion RPN+BN^[7]	18.29	800	TITAN X
MSDS-RCNN^[24]	11.63	220	TITAN X
IAF-RCNN^[9]	15.73	210	TITAN X
CIAN^[16]	14.12	70	GTX 1080Ti
AR-CNN^[10]	9.34	120	GTX 1080Ti
MBNet^[14]	8.13	70	GTX 1080Ti
Proposed method	7.62	70	GTX 1080Ti
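
The speed column in the last table is easier to compare as a frame rate. The short sketch below converts it, assuming the values are milliseconds per frame (an interpretation based on their magnitudes, not something the table states outright). The dictionary simply re-enters the table values, and `frames_per_second` is a hypothetical helper name. Because the rows were measured on different GPUs, the resulting rates are indicative only.

```python
# Speed column of the comparison table, ASSUMED to be milliseconds per frame.
latency_ms = {
    "ACF": 2730,
    "Halfway Fusion": 430,
    "Fusion RPN+BN": 800,
    "MSDS-RCNN": 220,
    "IAF-RCNN": 210,
    "CIAN": 70,
    "AR-CNN": 120,
    "MBNet": 70,
    "Proposed method": 70,
}


def frames_per_second(ms_per_frame: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


fps = {name: frames_per_second(ms) for name, ms in latency_ms.items()}
for name, value in fps.items():
    print(f"{name}: {value:.2f} frames/s")
```

Under this assumption, 70 ms per frame corresponds to roughly 14 frames per second.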