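1. Key Laboratory of Dynamics and Control of Flight Vehicle, Ministry of Education, School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081
2. Hangzhou Institute of National Extremely-weak Magnetic Field Infrastructure, Hangzhou, Zhejiang 310051
*Email: zhangcheng@bit.edu.cn
Received: 2023-08-10
Published online: 2024-10-30
Published in print: 2024-10-31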
Kai SUN, Cheng ZHANG, Tian ZHAN, et al. Multi-view Stereo Vision Reconstruction Network with Fusion Attention Mechanism and Multi-layer Dynamic Deformable Convolution[J]. Acta Armamentarii, 2024, 45(10): 3631-3641. DOI: 10.12382/bgxb.2023.0740.
Existing multi-view stereo (MVS) methods extract insufficient feature information from weakly textured regions and non-Lambertian surfaces, and their reconstruction quality suffers as a result. To address these problems, the AMDC-PatchmatchNet method is proposed, which fuses an attention mechanism with multi-layer dynamic deformable convolution. A feature extraction network integrating coordinate attention is constructed to capture the edge shapes and texture features of the reconstructed objects more accurately. An adaptive receptive field module based on dynamic deformable convolution is also integrated into the feature extraction network; it adjusts the size and shape of the receptive field according to features at different scales, yielding a feature representation that retains both global context and fine detail. Test results on the DTU dataset show that the proposed method improves the overall point cloud reconstruction metric by 2.8% compared with mainstream MVS methods, and its generalization ability is verified on an aerial image dataset.
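The coordinate attention idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the AMDC-PatchmatchNet implementation: it assumes a feature map stored as nested lists (channels × height × width), and it replaces the learned 1×1 convolutions, normalization, and channel reduction of the published coordinate attention block with a single hypothetical channel-mixing matrix `mix` (identity by default). The function name `coordinate_attention` is likewise illustrative.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coordinate_attention(feature, mix=None):
    """Re-weight a C x H x W feature map with row-wise and column-wise gates.

    feature: list of C channels, each an H x W nested list of floats.
    mix:     optional C x C matrix standing in for the learned 1x1
             convolutions (an assumption of this sketch; identity if None).
    """
    channels = len(feature)
    height = len(feature[0])
    width = len(feature[0][0])
    if mix is None:
        mix = [[1.0 if i == j else 0.0 for j in range(channels)]
               for i in range(channels)]

    # Directional average pooling: one descriptor per row (C x H)
    # and one per column (C x W), so position along each axis is kept.
    row_pool = [[sum(row) / width for row in ch] for ch in feature]
    col_pool = [[sum(ch[y][x] for y in range(height)) / height
                 for x in range(width)] for ch in feature]

    def gate(pooled):
        # Mix channels, then squash to (0, 1) to obtain attention weights.
        length = len(pooled[0])
        return [[sigmoid(sum(mix[o][i] * pooled[i][p]
                             for i in range(channels)))
                 for p in range(length)]
                for o in range(channels)]

    row_gate = gate(row_pool)  # C x H
    col_gate = gate(col_pool)  # C x W

    # Each output value is scaled by the gate of its row and of its column.
    return [[[feature[c][y][x] * row_gate[c][y] * col_gate[c][x]
              for x in range(width)]
             for y in range(height)]
            for c in range(channels)]


if __name__ == "__main__":
    fmap = [[[0.0, 0.0], [0.0, 4.0]]]  # one channel, 2 x 2
    print(coordinate_attention(fmap))
```

Because the gates are computed separately along the height and width axes, a strong response in one row or column raises the weight of every position sharing that coordinate, which is the property the abstract relies on for capturing edge shapes.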