Naval Submarine Academy, Qingdao 266041, Shandong, China
*Email: estella126@126.com
**Email: jinpingw@126.com
Received: 2023-07-23
Published online: 2024-10-30
Published in print: 2024-10-31
Jing YANG, Jinping WU, Jian LIU, et al. A Semi-supervised Learning Method for Intelligent Decision Making of Submarine Maneuvering Evasion[J]. Acta Armamentarii, 2024, 45(10): 3474-3487. DOI: 10.12382/bgxb.2023.0684.
When a submarine defends against incoming torpedoes, the weakly observable underwater environment makes the target information it obtains sparse. Maneuvering evasion is an important tactical method of submarine underwater defense. Existing methods for simulating and optimizing evasion maneuver parameters inevitably introduce observation errors during modeling and lack a means of responding as the situation evolves. In addition, because military experts are scarce, tactical confrontation samples labeled by experts are very expensive to obtain. To address these difficulties, a semi-supervised intelligent decision-making method that combines autoencoding with an active Q-learning strategy is proposed. A contrastive predictive coding autoencoder is introduced to maximize the mutual information between the time-series input and its context, which improves the representation of sparse time-series input. The learned representation is then combined with an active reinforcement learning task to reduce the agent's label demand rate and to improve its ability to respond to environmental feedback when making evasion decisions. God's-eye-view (complete-information) and red-side-view datasets are constructed from replay data of commanders' tactical exercises collected over three years. In an ablation comparison with a variant that omits the sparse time-series autoencoder, the proposed method reaches decision accuracies of 98% and 78% under the complete-information and red-side-view conditions, respectively, with label demand rates of only 4% and 44%. Compared with classical time-series classification algorithms, its decision accuracy is higher by 14% and 9%. Compared with supervised algorithms, with the label demand rate reduced to 24%~44% of the original, its decision accuracy differs by only 1%. These results indicate that the proposed method greatly reduces label demand while maintaining decision accuracy, and thus offers a general technical framework for military intelligent decision-making with few samples.
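The abstract states that the contrastive predictive coding step maximizes mutual information between the time-series input and its context. A minimal sketch of the InfoNCE objective usually associated with contrastive predictive coding is given below, assuming a bilinear score between a context vector and candidate future encodings. The function names, the toy vectors, and the matrix `W` are illustrative assumptions; the abstract does not describe the authors' network or training setup, so this is not their implementation.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(W, c):
    """Multiply matrix W (list of rows) by vector c."""
    return [dot(row, c) for row in W]


def info_nce_loss(context, positive, negatives, W):
    """InfoNCE loss for a single context vector (hypothetical helper).

    context   : summary of the past observations, c_t
    positive  : encoding of the true future step, z_{t+k}
    negatives : encodings drawn from other time steps or sequences
    W         : bilinear prediction matrix (assumed form of the score)
    """
    pred = matvec(W, context)  # predicted future encoding, W c_t
    # Score every candidate; index 0 is the true future step.
    scores = [dot(positive, pred)] + [dot(z, pred) for z in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    # Negative log-softmax probability assigned to the true future step.
    return log_sum - scores[0]


# Toy data (assumed): identity W, one positive, two negatives.
W = [[1.0, 0.0], [0.0, 1.0]]
context = [1.0, 0.0]
positive = [2.0, 0.0]
negatives = [[0.0, 2.0], [-2.0, 0.0]]

loss_aligned = info_nce_loss(context, positive, negatives, W)
# Swap roles so the "true" future points away from the prediction.
loss_misaligned = info_nce_loss(context, negatives[1], [positive, negatives[0]], W)
print(loss_aligned < loss_misaligned)
```

With N candidates in total, log N minus this loss is a lower bound on the mutual information between the future encoding and the context, so lowering the loss tightens the bound. When every candidate scores the same, the loss equals log N, which is the uninformative baseline.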