Low-power speech denoising method for non-stationary noise in underground mines

YANG Yi; TAN Xiao; CHANG Yajun; WANG Keping; LIU Binbin; WANG Tian

doi:10.13225/j.cnki.jccs.2024.0796

YANG Yi, TAN Xiao, CHANG Yajun, et al. Low-power speech denoising method for non-stationary noise in underground mines[J]. Journal of China Coal Society, 2025, 50(7): 3692–3706. DOI: 10.13225/j.cnki.jccs.2024.0796

Abstract

Voice intercom systems in fully mechanized mining faces suffer from severe non-stationary noise interference. Under tight power constraints, denoising speech at ultra-low signal-to-noise ratios (SNR) is one of the core technologies for ensuring that voice information is transmitted correctly in the working face. Building on the IMCRA algorithm, this paper proposes MIMCRA, a non-stationary noise removal method tailored to the speech characteristics of fully mechanized mining faces. First, an improved two-step noise removal method addresses the inaccurate estimation of non-stationary noise caused by the delay in a priori SNR estimation: the a priori SNR of the previous frame and the clean speech of the current frame are used to estimate, in a rolling manner, the a priori SNR of the current frame and the clean speech of the next frame, so that the a priori SNR is estimated in real time. Second, a frame-frequency dynamic smoothing factor adjustment mechanism addresses the noise overestimation that occurs when the noisy power spectrum is smoothed with a fixed smoothing factor, which makes speech information difficult to extract: the power spectrum of the noisy speech is smoothed dynamically according to the minimum mean square error between the smoothed power spectral density and the noise power spectral density. Third, to address incomplete noise removal at low SNR, a noise presence probability detection mechanism that protects weak speech components is proposed: based on the statistical differences in energy distribution between noise and weak speech in the 2 to 4 kHz band, the denoised signal is checked for noise and the residual noise is eliminated.
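To illustrate why a two-step estimate reduces the one-frame delay of the a priori SNR, the following is a minimal sketch of the generic decision-directed estimator followed by a second refinement step on the current frame. It is not the MIMCRA rolling estimator itself, whose details are not given in this abstract. The function names, the Wiener gain, the smoothing weight `beta = 0.98`, and the floor `xi_min` are all assumptions chosen for illustration.

```python
def wiener_gain(xi):
    """Wiener gain for an a priori SNR value xi (assumed gain rule)."""
    return xi / (1.0 + xi)


def two_step_prior_snr(noisy_psd, noise_psd, prev_clean_psd,
                       beta=0.98, xi_min=1e-3):
    """Estimate the a priori SNR for one frame in two steps.

    noisy_psd, noise_psd, prev_clean_psd are per-bin power values
    (lists of equal length). noise_psd entries must be positive.
    Returns (xi_refined, clean_psd), both per-bin lists.
    """
    xi_refined = []
    clean_psd = []
    for y2, n2, s2_prev in zip(noisy_psd, noise_psd, prev_clean_psd):
        gamma = y2 / n2  # a posteriori SNR of the current frame

        # Step 1: decision-directed estimate. It leans heavily on the
        # previous frame's clean-speech estimate, hence the delay.
        xi_dd = beta * s2_prev / n2 + (1.0 - beta) * max(gamma - 1.0, 0.0)
        xi_dd = max(xi_dd, xi_min)
        g1 = wiener_gain(xi_dd)

        # Step 2: re-estimate the a priori SNR from the current frame
        # after applying the step-1 gain, which removes most of the lag.
        xi2 = max(g1 * g1 * gamma, xi_min)
        g2 = wiener_gain(xi2)

        xi_refined.append(xi2)
        clean_psd.append(g2 * g2 * y2)  # clean-speech power for the next frame
    return xi_refined, clean_psd


if __name__ == "__main__":
    # Bin 0: speech onset (previous frame silent). Bin 1: noise only.
    xi, clean = two_step_prior_snr([101.0, 1.0], [1.0, 1.0], [0.0, 0.0])
    print(xi, clean)
```

In the onset bin of this example the step-1 estimate is only about 2, whereas the refined estimate is about 44.9, which is much closer to the a posteriori SNR of 101. The returned `clean_psd` would be passed as `prev_clean_psd` for the next frame.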
Comparative experiments show that, for input signal-to-noise ratios (SNR) from −5 to 10 dB, the proposed algorithm improves the segmental SNR by about 3 dB, reduces the segmentation error by about 0.3, and reduces the log-spectral distance by about 0.2 relative to the IMCRA algorithm. In particular, at an input SNR of −5 dB the proposed algorithm still raises the segmental SNR to −2.7995 dB, indicating strong denoising capability for noisy speech at ultra-low SNR. The algorithm has been deployed in low-power form in the latest fully mechanized working-face intercom system developed by Zhengmei Machinery, with a chip power consumption of approximately 16.5 to 66.0 mW. Processing a 32 ms speech frame takes approximately 16 ms, which meets real-time requirements.
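For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute a segmental SNR and the real-time factor implied by the timing figures. The exact metric definitions used in the paper are not stated in this abstract, so the frame-wise averaging, the clamping range of −10 to 35 dB, and the function names are assumptions.

```python
import math


def segmental_snr(clean, enhanced, frame_len, lo_db=-10.0, hi_db=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo_db, hi_db].

    clean and enhanced are equal-length sample sequences. Frames are
    non-overlapping; a trailing partial frame is ignored.
    """
    if len(clean) != len(enhanced):
        raise ValueError("clean and enhanced must have the same length")
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        sig = sum(x * x for x in s)
        err = sum((x - y) ** 2 for x, y in zip(s, e))
        if sig == 0.0:
            continue  # skip silent reference frames
        snr = hi_db if err == 0.0 else 10.0 * math.log10(sig / err)
        values.append(min(max(snr, lo_db), hi_db))
    if not values:
        raise ValueError("no usable frames")
    return sum(values) / len(values)


def real_time_factor(processing_ms, frame_ms):
    """Processing time divided by frame duration; below 1 is real time."""
    return processing_ms / frame_ms


if __name__ == "__main__":
    clean = [1.0] * 8
    enhanced = [0.9] * 8  # constant 10 % amplitude error
    print(segmental_snr(clean, enhanced, frame_len=4))
    print(real_time_factor(16.0, 32.0))
```

With the timing quoted above (about 16 ms to process a 32 ms frame), `real_time_factor` gives 0.5, so processing uses roughly half of the available frame time.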