
Research On Multilingual Speech Parameter Extraction And Statistical Feature Recognition

Posted on: 2021-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: Z R Zhao
GTID: 2518306200453124
Subject: Electronics and Communications Engineering
Abstract/Summary:
Language plays an important role in today's society. It spreads among people as written words and speech, and it is a bridge for communication. However, differences between languages are also an obstacle to communication for most people. With so many languages in the world, automatic language identification is particularly important: as front-end processing for cross-language communication systems, multilingual information service systems, and automatic machine translation, it brings security and convenience to people's use of information. It has therefore become a key technology and a research hotspot in speech signal processing.

Feature extraction is a crucial step in language identification and a difficult problem in speech signal processing. Selecting appropriate feature parameters in complex, noisy environments can effectively improve identification accuracy. The commonly used feature parameters at present include linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC), but these features have poor noise robustness and identify languages poorly in complex environments. To address these problems, this thesis proposes an improved feature set and an improved language identification algorithm based on the Gammatone filter, and performs language identification with the Gaussian mixture model (GMM).

First, language identification with the improved feature set uses a system framework based on the Gaussian mixture model-universal background model (GMM-UBM). The system is divided into three modules. The feature extraction module builds the feature sets of the multilingual speech signals: syllables are segmented with a syllable segmentation algorithm and the length of each syllable is counted, the pitch period is extracted using the cepstrum, and the traditional MFCC coefficients are extracted with the MFCC extraction algorithm. On the basis of the MFCC, an improved feature set based on MFCC and syllable length and an improved feature set based on MFCC and pitch period are built. The model training module trains the Gaussian mixture models: under the maximum likelihood estimation (MLE) criterion, the EM algorithm estimates the model parameters to obtain the universal background model and the acoustic model adapted to each language. The test module scores the test segment and takes the language with the maximum likelihood as the final identification result.

Second, the improved language identification algorithm based on the Gammatone filter uses the same GMM-UBM identification framework. Its feature extraction module proposes GTCC feature parameters based on the Gammatone filter and builds an improved feature set based on GTCC and syllable length and an improved feature set based on GTCC and pitch period. Language identification is then tested on these feature sets. All feature sets are then evaluated for language identification under different signal-to-noise ratio (SNR) conditions, and their identification performance is analyzed and compared.

Finally, all feature sets are implemented in the GMM-UBM system for language identification. Identification accuracy was measured in tests on real broadcast speech signals, and the results indicate that the proposed method performs well.
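The maximum-likelihood decision made in the test module can be illustrated with a minimal sketch. It assumes diagonal-covariance GMMs that have already been trained, one per language, and it scores an utterance by its average per-frame log-likelihood. The function names, the toy two-dimensional models, and the labels `lang_a` and `lang_b` are hypothetical and are not taken from the thesis; UBM training and per-language adaptation are not shown.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def utterance_loglik(frames, gmm):
    """Average per-frame log-likelihood of an utterance (a list of feature vectors)."""
    return sum(gmm_frame_loglik(x, gmm) for x in frames) / len(frames)


def identify_language(frames, language_models):
    """Return the language whose GMM gives the highest average log-likelihood."""
    scores = {
        lang: utterance_loglik(frames, gmm)
        for lang, gmm in language_models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: two languages, two mixture components each,
# over 2-D features (real systems use many more components and dimensions).
models = {
    "lang_a": [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [1.0, 1.0], [1.0, 1.0])],
    "lang_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 4.0], [1.0, 1.0])],
}
test_frames = [[0.2, -0.1], [0.9, 1.1], [0.1, 0.3]]
best, scores = identify_language(test_frames, models)
print(best, scores)
```

In a full system the frames would be the feature vectors produced by the feature extraction module (for example MFCC or GTCC with the added syllable-length or pitch-period features), and each language model would be obtained from the universal background model as described above.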
Keywords/Search Tags:Language identification, Feature set, Gaussian mixture model, Gammatone filter