Journal of Guangdong University of Technology ›› 2015, Vol. 32 ›› Issue (04): 145-149. doi: 10.3969/j.issn.1007-7162.2015.04.026

• Comprehensive Studies •

Research on Speaker Recognition Based on Both AP and GMM

Wang Bo, Zhong Ying-chun, Chen Jun-bin

  1. School of Automation, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2014-11-26 Online: 2015-12-04 Published: 2015-12-04
  • About the author: Wang Bo (b. 1989), male, master's student; main research interest: image understanding.
  • Funding:

    Science and Technology Planning Project of Guangdong Province (2010A030500006)

Abstract: In speaker recognition, the order of the classical Gaussian mixture model (GMM) is determined largely arbitrarily. To address this problem, a method is proposed that uses affinity propagation (AP) clustering to obtain the GMM order automatically and then performs speaker recognition with the resulting model. First, speech feature parameters are extracted by combining Mel-frequency cepstral coefficients (MFCC) with differential cepstral coefficients. Second, AP clustering is applied to the speech feature parameters, which yields the GMM order automatically. The GMM is then trained with this order. Finally, the trained GMM is used in speaker recognition experiments on the TIMIT standard speech corpus and on a self-built speech corpus collected from online volunteers. With the GMM order obtained by AP clustering, the test result on the TIMIT corpus is 100%. On the self-built corpus, which has 168 training samples (including 10 Chaoshan-dialect samples and 10 Hunan-dialect samples) and 42 test samples, the test result is 97.6%. The experimental results indicate that obtaining the GMM order automatically through AP clustering can significantly improve the accuracy and efficiency of speaker recognition.

Key words: speaker recognition; mel frequency cepstrum coefficient (MFCC); affinity propagation (AP) clustering algorithm; Gaussian mixture model (GMM)
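
The order-selection step described in the abstract can be illustrated with a minimal sketch: run AP clustering on the feature frames and take the number of exemplars as the GMM order. The abstract does not give the AP settings, so everything here is an assumption made for illustration: the function name `estimate_gmm_order`, the negative squared Euclidean similarity, the median preference, the damping factor, and the iteration count. The toy 2-D points stand in for MFCC plus differential cepstral vectors.

```python
"""Sketch: use affinity propagation (AP) to pick a GMM order from feature frames."""
from statistics import median


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def estimate_gmm_order(frames, damping=0.5, iterations=200):
    """Return (order, exemplar_indices) found by AP on the given frames.

    Assumptions (not taken from the paper): similarity is the negative
    squared Euclidean distance, and the preference is the median similarity.
    """
    n = len(frames)
    if n < 2:
        return n, list(range(n))
    # Similarity matrix; the diagonal holds the shared preference value.
    S = [[-squared_distance(frames[i], frames[k]) for k in range(n)] for i in range(n)]
    preference = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of [a(i, k') + s(i, k')]
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            runner_up = max(s for k, s in enumerate(scores) if k != best)
            for k in range(n):
                rival = runner_up if k == best else scores[best]
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k), i' not in {i, k})
        # a(k, k) = sum of positive r(i', k), i' != k
        for k in range(n):
            positive = [max(0.0, R[i][k]) for i in range(n)]
            support = sum(positive) - positive[k]
            for i in range(n):
                if i == k:
                    update = support
                else:
                    update = min(0.0, R[k][k] + support - positive[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * update
    # A frame is an exemplar when its self-responsibility plus
    # self-availability is positive; the exemplar count is the order.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    return max(1, len(exemplars)), exemplars


# Hypothetical toy frames: three well-separated groups of 2-D vectors.
frames = [
    (0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
    (10.0, 5.0), (11.0, 5.0), (12.0, 5.0),
    (20.0, 0.0), (21.0, 0.0), (22.0, 0.0),
]
order, exemplars = estimate_gmm_order(frames)
print("estimated GMM order:", order)
print("exemplar frame indices:", exemplars)
```

In a full pipeline, the returned order would be used as the number of mixture components when fitting each speaker's GMM. Feature extraction, GMM training, and scoring are omitted from this sketch.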
