Journal of Guangdong University of Technology ›› 2018, Vol. 35 ›› Issue (03): 43-46. doi: 10.12052/gdutxb.170173

• Comprehensive Studies •

Research and Improvement of Noise Estimation Algorithm in Intelligent Speech Recognition System

Wu Nan, Feng Zu-yong, Wei Gao-wu

  1. School of Physics and Optoelectronics Engineering, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2017-12-11 Online: 2018-05-09 Published: 2018-04-26
  • Corresponding author: Feng Zu-yong (born 1975), male, professor; main research interests: energy materials and intelligent control. E-mail: fengzuyong@foxmail.com
  • About the author: Wu Nan (born 1992), male, master's student; main research interest: intelligent communications.
  • Supported by:
    Science and Technology Planning Project of Guangdong Province (2016A010104019); Science and Technology Planning Project of Guangzhou (201510010285)

Abstract: Intelligent speech recognition has been studied for a long time. However, because speech signals are inherently variable, transient, continuous, and dynamic, machines still have difficulty recognizing speech across different environments, especially noisy ones. To improve the recognition accuracy for noisy speech, this paper studies a commonly used noise estimation algorithm, the time-recursive averaging algorithm based on the a posteriori signal-to-noise ratio (SNR), and proposes an improvement to its smoothing factor. The voice activity detection (VAD) algorithm and these two algorithms were simulated at different input SNRs. A comparison of the results shows that the improved algorithm raises the output segmental SNR by up to 2.1 dB relative to the VAD algorithm and by up to 0.5 dB relative to the original time-recursive averaging algorithm, indicating that the improved algorithm can effectively improve the quality and intelligibility of speech at low input SNR.

Key words: speech recognition, noise estimation, time-recursive averaging algorithm, smoothing factor
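
To make the idea of time-recursive averaging with an SNR-dependent smoothing factor concrete, the following is a minimal sketch under stated assumptions. The abstract does not give the paper's formulas, so the sigmoid mapping, the `beta` and `offset` values, the function names, and the assumption that the first few frames are speech-free are all illustrative choices, not the authors' method or their proposed improvement.

```python
import math
from typing import List, Sequence


def smoothing_factor(posterior_snr: float, beta: float = 5.0, offset: float = 1.5) -> float:
    """Map an a posteriori SNR to a smoothing factor in (0, 1).

    Assumption: a sigmoid mapping with illustrative beta/offset values.
    High SNR (speech likely) gives a factor near 1, which freezes the estimate.
    """
    return 1.0 / (1.0 + math.exp(-beta * (posterior_snr - offset)))


def estimate_noise(frames: Sequence[Sequence[float]], init_frames: int = 5) -> List[List[float]]:
    """Track the noise power spectrum frame by frame.

    frames[t][k] is the noisy power |Y(t, k)|^2 in frame t, frequency bin k.
    Assumption: the first `init_frames` frames contain noise only.
    """
    n_bins = len(frames[0])
    head = frames[:init_frames]
    # Initial estimate: average of the leading (assumed noise-only) frames.
    noise = [sum(f[k] for f in head) / len(head) for k in range(n_bins)]
    history: List[List[float]] = []
    for frame in frames:
        updated = []
        for k in range(n_bins):
            # A posteriori SNR: noisy power over the previous noise estimate.
            gamma = frame[k] / max(noise[k], 1e-12)
            alpha = smoothing_factor(gamma)
            # Recursive average: keep the old estimate when speech is likely,
            # follow the current frame when the bin looks like noise.
            updated.append(alpha * noise[k] + (1.0 - alpha) * frame[k])
        noise = updated
        history.append(noise)
    return history
```

In this sketch a bin whose power jumps well above the current noise estimate is treated as speech and leaves the estimate almost unchanged, while a bin near the noise floor updates quickly. A full enhancement system would pass the resulting estimate to a suppression stage such as spectral subtraction or a Wiener filter.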

CLC number:

  • TP391.42