
Acta Armamentarii, 2024, Vol. 45, Issue (10): 3474-3487. doi: 10.12382/bgxb.2023.0684


A Semi-supervised Learning Method for Intelligent Decision Making of Submarine Maneuvering Evasion

YANG Jing*, WU Jinping**, LIU Jian, WANG Yongjie, DONG Hanquan

  1. Navy Submarine College, Qingdao 266041, Shandong, China
  • Received: 2023-07-23 Online: 2024-01-11
  • Contact: YANG Jing, WU Jinping

Abstract:

When a submarine defends against incoming torpedoes, it operates in a weakly observable underwater environment, and the target information it obtains is sparse. Setting the maneuvering parameters is a key part of submarine tactical decision-making. Existing methods for setting these parameters inevitably introduce observation errors during modeling and lack a means of responding to the evolving situation; moreover, because military experts are scarce, samples of their flexible tactical confrontation decisions are very expensive to obtain. To address these difficulties, an intelligent tactical decision-making method that combines self-coding (autoencoding) with an active Q-learning strategy is proposed. A contrastive predictive coding autoencoder is introduced to maximize the mutual information between the time series input and its context, which improves the representation of sparse time series input. The resulting representation is combined with an active reinforcement learning task to reduce the agent's label demand rate and to make parameter setting more responsive to environmental feedback. Datasets from the God's-eye perspective and the red-side perspective are constructed from data collected over the past three years. Experiments on these datasets show that the proposed method and its ablated variant without the sparse time series autoencoder reach decision accuracies of 98% and 78%, with label demand rates of only 4% and 44%, respectively. Compared with the baseline methods, including the classical time series classification model, the proposed method improves decision accuracy by 14% and 9%. With the label demand rate reduced to 24%~44%, its decision accuracy error relative to real human actions differs from that of the supervised model by only 1%. These results indicate that the proposed model can greatly reduce label demand while maintaining decision-making accuracy.
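
As an illustration of the mutual-information objective mentioned in the abstract, the sketch below computes the InfoNCE loss commonly used in contrastive predictive coding, together with the lower bound on mutual information that it implies (log N minus the loss, for N candidates). It is a minimal sketch and not the authors' implementation: the abstract does not state the exact loss, and the function names and the `scores` matrix (similarities between each context and each candidate future encoding, which an encoder would normally produce) are assumptions made for this example.

```python
import math

def info_nce_loss(scores):
    """Mean InfoNCE loss over a batch.

    scores[i][j] is a (hypothetical) similarity between context i and
    candidate future encoding j; the true pair is j == i, and the other
    entries in row i act as negative samples.
    """
    n = len(scores)
    total = 0.0
    for i, row in enumerate(scores):
        m = max(row)  # subtract the row maximum for numerical stability
        log_partition = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_partition - row[i]  # negative log-softmax of the positive
    return total / n

def mi_lower_bound(scores):
    """Lower bound on mutual information implied by InfoNCE: log N - loss."""
    return math.log(len(scores)) - info_nce_loss(scores)

# Uninformative scores: the bound is zero.
flat = [[0.0] * 4 for _ in range(4)]
# Scores that single out the true pair: the bound approaches log N.
aligned = [[100.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
print(mi_lower_bound(flat), mi_lower_bound(aligned))
```

Minimizing this loss pushes the score of each true context and future pair above the scores of the negatives, which tightens the bound. The bound can never exceed log N, so the number of candidates limits how much mutual information the objective can certify.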

Key words: submarine evasion defense, sparse labels, active Q-learning, self-coding, intelligent decision-making
