
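1. School of Automation, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China
2. School of Energy and Power Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China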
Received: 09 September 2025
Online First: 23 April 2026
ZHENG Y, MA G L, GUO J, et al. Single object tracking algorithm based on preceding-feature prompts[J/OL]. Acta Armamentarii, 2026(2026-04-23). https://doi.org/10.12382/bgxb.2025.0829. (in Chinese)
Most existing single-object tracking algorithms rely on similarity matching between the initial-frame template and the current-frame search region, which makes them prone to tracking drift and mismatches in complex scenes. To address this, we propose a Transformer-based tracking algorithm built on preceding-feature prompts. A generalized relation modeling module adaptively partitions the search-region tokens into target and background tokens and allows only target tokens to interact with template tokens, which suppresses interference from similar objects and environmental noise. A preceding-feature prompt module is added to the Transformer tracking framework: it combines the token partitioning result with the current-frame prediction to generate a precise target mask, and a preceding-feature prompt encoder jointly encodes the concatenation of the mask and the search image. A preceding-feature prompt decoder then adaptively aggregates the encoded results according to the current-frame search-region features to generate specific prompts, improving the tracker's adaptability. The algorithm is evaluated on multiple benchmark datasets. It achieves an average overlap of 76.1% on the GOT-10K dataset and, on the LaSOT dataset, an area under the success-rate curve of 72.3% and a precision of 82.4%. These results indicate strong tracking performance and effective handling of occlusion, deformation, and interference from similar objects.
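The three reported metrics (average overlap, area under the success-rate curve, and precision) can be illustrated with a minimal Python sketch. It is not the official GOT-10K or LaSOT evaluation code. The function names, the `(x, y, w, h)` box format, the 21 evenly spaced overlap thresholds, and the 20-pixel center-distance threshold are assumptions based on common tracking-benchmark conventions, and the official toolkits may differ in detail (for example, in how sequences are aggregated).

```python
import math

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def average_overlap(preds, gts):
    """Mean IoU over all frames (the idea behind an average-overlap score)."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    return sum(overlaps) / len(overlaps)

def success_auc(preds, gts, num_thresholds=21):
    """Area under the success curve, approximated as the mean success rate.

    The success rate at threshold t is the fraction of frames whose IoU
    exceeds t; thresholds are assumed to be evenly spaced on [0, 1].
    """
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)

def precision(preds, gts, max_dist=20.0):
    """Fraction of frames whose box-center error is at most max_dist pixels."""
    hits = 0
    for (px, py, pw, ph), (gx, gy, gw, gh) in zip(preds, gts):
        dx = (px + pw / 2) - (gx + gw / 2)
        dy = (py + ph / 2) - (gy + gh / 2)
        if math.hypot(dx, dy) <= max_dist:
            hits += 1
    return hits / len(preds)

# Toy two-frame sequence: a perfect prediction, then one shifted by 5 pixels.
preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
gts = [(0, 0, 10, 10), (5, 0, 10, 10)]
print(average_overlap(preds, gts))
print(success_auc(preds, gts))
print(precision(preds, gts))
```

In the toy sequence the second frame overlaps the ground truth by one third, so the average overlap is 2/3, while both center errors fall well inside the assumed 20-pixel threshold.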