
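1. School of Electronic and Information Engineering, Changchun University of Science and Technology, Changchun 130013, Jilin, China
2. The 53rd Research Institute of China Electronics Technology Group Corporation, Tianjin 300308, China
3. School of Mechanical and Electrical Engineering, Changchun University of Science and Technology, Changchun 130013, Jilin, China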
Received: 24 November 2025
Online First: 5 April 2026
ZHAO F Y, DING H C, WANG Z K, et al. Bionic object tracking algorithm based on cross modal fusion of event domain and RGB domain[J/OL]. Acta Armamentarii, 2026(2026-04-05). https://doi.org/10.12382/bgxb.2025.1021. (in Chinese)
To address the poor robustness of single-modal visual tracking in complex and harsh environments, a bionic object tracking framework based on cross-modal fusion of the event domain and the RGB domain is proposed. Leveraging the complementary characteristics of event cameras and RGB cameras, a tracking framework adaptable to complex scenes is constructed, consisting of a bionic RGB module, an event module, a cross-modal fusion module, and a classification-regression module. Experimental results show that the proposed algorithm achieves precision and success rates of 61.2 and 78.1 on the VisEvent dataset and 62.1 and 92.3 on the FE108 dataset, with a tracking speed of 63 FPS. Compared with single-modal tracking methods, it effectively mitigates the effects of high-speed motion, underexposure, overexposure, and similar-object interference. The proposed bionic tracking framework integrates the advantages of dual-modal perception, significantly improves tracking robustness in complex environments, and better meets the requirements of all-weather, multi-scenario object tracking tasks.
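
The abstract reports precision and success rates without defining them. As a rough illustration only, the sketch below computes both under conventions that are common in single-object tracking benchmarks and that are assumed here, not taken from the paper: precision as the fraction of frames whose predicted box center lies within 20 pixels of the ground-truth center, and success as the mean, over IoU thresholds from 0 to 1 in steps of 0.05, of the fraction of frames whose IoU exceeds the threshold. The function names, the `(x, y, w, h)` box layout, and the threshold values are all hypothetical choices.

```python
import math
from typing import List, Tuple

# Assumed box layout: (x, y, w, h), with (x, y) the top-left corner.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a: Box, b: Box) -> float:
    """Euclidean distance between the centers of two boxes, in pixels."""
    acx, acy = a[0] + a[2] / 2, a[1] + a[3] / 2
    bcx, bcy = b[0] + b[2] / 2, b[1] + b[3] / 2
    return math.hypot(acx - bcx, acy - bcy)


def precision_rate(preds: List[Box], gts: List[Box],
                   pixel_threshold: float = 20.0) -> float:
    """Fraction of frames with center error at or below the threshold."""
    hits = sum(center_error(p, g) <= pixel_threshold
               for p, g in zip(preds, gts))
    return hits / len(gts)


def success_rate(preds: List[Box], gts: List[Box], steps: int = 20) -> float:
    """Mean over IoU thresholds 0, 1/steps, ..., 1 of the fraction of
    frames whose IoU is strictly greater than the threshold."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / steps for i in range(steps + 1)]
    per_threshold = [sum(o > t for o in overlaps) / len(overlaps)
                     for t in thresholds]
    return sum(per_threshold) / len(per_threshold)
```

Benchmarks differ in the exact thresholds and in whether scores are reported as percentages, so values from this sketch are not directly comparable with the figures quoted above.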