Voice Transmission and Reconstruction on the Battlefield

doi:10.12382/bgxb.2021.0549

Abstract

Abstract: To address the poor quality of speech transmission and reconstruction under high compression ratios and low signal-to-noise ratios (SNR), a spectrogram-based method for speech compression, transmission, and reconstruction is proposed. At the transmitter, the speech signal is converted into a spectrogram, which is then transmitted. At the receiver, the spectrogram is denoised as an image, and the amplitude spectrum of the speech signal is recovered from the denoised image. A vocal reconstruction model is then built to reconstruct the speech signal from the amplitude spectrum. Experimental results show that, in a noise-free environment, the mean perceptual evaluation of speech quality (PESQ) score of the reconstructed speech exceeds 3 at compression ratios of both 10 and 40. At low SNR with a compression ratio of 10, the mean PESQ score still exceeds 2. Compared with traditional compressed-sensing speech reconstruction algorithms, the proposed method markedly improves reconstructed speech quality at high compression ratios.

Key words: speech transmission and reconstruction, image enhancement, vocal reconstruction model, compression ratio and low signal-to-noise ratio
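The pipeline described in the abstract (speech to spectrogram image, image denoising, amplitude-spectrum recovery, resynthesis) can be illustrated with a minimal sketch. It is not the authors' implementation. The frame length, hop size, 80 dB dynamic range, 8-bit grey-level mapping, and 3x3 median filter are all assumptions chosen for illustration. Griffin-Lim phase estimation stands in for the paper's vocal reconstruction model, which the abstract does not specify. The compression step is omitted for the same reason.

```python
import numpy as np

N_FFT, HOP = 256, 64  # assumed frame length and hop (75% overlap), not from the paper


def _window():
    # Square-root periodic Hann window, applied at analysis and synthesis;
    # its squares overlap-add to the constant N_FFT / (2 * HOP).
    return np.sqrt(np.hanning(N_FFT + 1)[:-1])


def stft(x):
    """Short-time Fourier transform, shape (frames, N_FFT // 2 + 1)."""
    win = _window()
    n_frames = 1 + (len(x) - N_FFT) // HOP
    frames = np.stack([x[i * HOP:i * HOP + N_FFT] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spec):
    """Weighted overlap-add inverse of stft (exact away from the signal edges)."""
    frames = np.fft.irfft(spec, n=N_FFT, axis=1) * _window()
    out = np.zeros(N_FFT + HOP * (len(spec) - 1))
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + N_FFT] += frame
    return out / (N_FFT / (2 * HOP))


def to_image(mag, floor_db=-80.0):
    """Map a magnitude spectrogram to 8-bit grey levels on a dB scale."""
    db = 20 * np.log10(np.maximum(mag, 1e-10))
    peak = db.max()
    db = np.clip(db - peak, floor_db, 0.0)
    img = np.round((db - floor_db) / -floor_db * 255).astype(np.uint8)
    return img, peak  # peak is side information needed to undo the mapping


def from_image(img, peak, floor_db=-80.0):
    """Undo to_image: grey levels back to linear magnitudes."""
    db = img.astype(float) / 255 * -floor_db + floor_db + peak
    return 10 ** (db / 20)


def median3(img):
    """3x3 median filter, a placeholder for the paper's image denoising."""
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    shifts = [p[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.median(np.stack(shifts), axis=0).astype(np.uint8)


def griffin_lim(mag, n_iter=50, seed=0):
    """Estimate a phase for mag by alternating projections, then resynthesise."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(n_iter):
        phase = np.exp(1j * np.angle(stft(istft(mag * phase))))
    return istft(mag * phase)


# Demo on a synthetic two-tone signal standing in for speech.
fs = 8000
t = np.arange(fs // 2) / fs
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)

mag = np.abs(stft(x))                    # transmitter: amplitude spectrum
img, peak = to_image(mag)                # transmitter: spectrogram as an image
mag_rx = from_image(median3(img), peak)  # receiver: denoise, undo the mapping
y = griffin_lim(mag_rx)                  # receiver: estimate phase, resynthesise
```

Only the magnitude image and one scalar (`peak`) cross the channel in this sketch, so the receiver has to estimate the phase itself. That is the role the abstract assigns to the vocal reconstruction model.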

CLC number: TN912.3

SHAO Yubin, LIU Jing, LONG Hua, LI Yimin. Voice Transmission and Reconstruction on the Battlefield[J]. Acta Armamentarii, 2022, 43(11): 2827-2835.
