
Spoken Language Identification Using Deep Belief Network

Posted on: 2017-05-28
Degree: Master
Type: Thesis
Country: China
Candidate: J J He
Full Text: PDF
GTID: 2348330518494761
Subject: Information and Communication Engineering
Abstract/Summary:
Language is the most important means of human communication. With the development of economic globalization, communication between people has become increasingly frequent and close. However, language diversity has become an obstacle to communication. Meanwhile, with the rapid development of the Internet and multimedia technology, a large amount of audio and video data has emerged, and breaking through language barriers has become the key to obtaining useful information. Computer speech processing, including technologies such as natural language understanding and speaker identification, has become a research hotspot in the speech field and is important for speech recognition, multilingual information service systems, and military security. Target language identification mainly detects voice messages that contain the target languages. It is a front-end component of speech recognition technology and plays an important role in speech processing. However, poor system robustness and a low recognition rate on short speech segments remain the main difficulties of language identification systems. This thesis applies deep learning to language identification to address these shortcomings. The main work and innovations of this thesis can be summarized as follows:

(1) Improve a traditional target language identification method based on the hidden Markov model (HMM). The input feature vector comprises MFCC, spectral bandwidth, pitch standard deviation, silence rate, sub-band energy distribution, zero-crossing rate, and low-frequency energy rate. After the HMMs are trained, the scores of audio in an unknown language are obtained using the Viterbi algorithm. This thesis adds a nonlinear-mapping normalization step before the decision; compared with deciding directly on the raw scores, the identification rate increases by 2%.

(2) Propose a method of target language identification based on the deep belief network (DBN). This thesis describes how to apply the restricted Boltzmann machine (RBM) and the DBN to language identification for sets of speech segments of various durations, and reports experiments with various acoustic features combined with MFCC. The DNN structure using features extracted from the DBN achieves state-of-the-art performance on the language identification task. The MFCC features represented by the DBN provide robust representations that are quite useful for the LID task by themselves. The DBN acts as a bridge between purely acoustic feature input and purely phonotactic classification. When the MFCC features are used in conjunction with other acoustic features, significant improvements are obtained. For similar languages, the DBN system shows some limitations.
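
The HMM scoring and score-normalization steps of item (1) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the method of the thesis: it uses discrete observation symbols instead of continuous features such as MFCC, the abstract does not specify the nonlinear mapping, so a softmax over per-frame log-scores is assumed as a stand-in, and all names (`viterbi_log_score`, `normalize_scores`, `identify`, `TOY_MODELS`, the 0.5 threshold) are hypothetical.

```python
import math


def viterbi_log_score(obs, log_pi, log_A, log_B):
    """Log-probability of the best state path of a discrete-observation HMM."""
    n_states = len(log_pi)
    # delta[j]: best log-score of any state path ending in state j so far
    delta = [log_pi[j] + log_B[j][obs[0]] for j in range(n_states)]
    for symbol in obs[1:]:
        delta = [
            max(delta[i] + log_A[i][j] for i in range(n_states)) + log_B[j][symbol]
            for j in range(n_states)
        ]
    return max(delta)


def normalize_scores(log_scores, n_frames):
    """Assumed nonlinear mapping: softmax over length-normalized log-scores."""
    per_frame = [s / n_frames for s in log_scores]
    top = max(per_frame)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in per_frame]
    total = sum(exps)
    return [e / total for e in exps]


def identify(obs, models, threshold=0.5):
    """Return (best language, its normalized score, whether it passes the threshold)."""
    names = list(models)
    raw = [viterbi_log_score(obs, *models[name]) for name in names]
    norm = normalize_scores(raw, len(obs))
    best = max(range(len(names)), key=lambda k: norm[k])
    return names[best], norm[best], norm[best] >= threshold


def to_log(rows):
    """Convert a matrix of probabilities to log-probabilities."""
    return [[math.log(p) for p in row] for row in rows]


# Hypothetical toy models: two states, two observation symbols per language.
# Each entry is (log initial probs, log transition matrix, log emission matrix).
_LOG_PI = [math.log(0.5), math.log(0.5)]
_LOG_A = to_log([[0.9, 0.1], [0.1, 0.9]])
TOY_MODELS = {
    "lang_a": (_LOG_PI, _LOG_A, to_log([[0.9, 0.1], [0.8, 0.2]])),  # favors symbol 0
    "lang_b": (_LOG_PI, _LOG_A, to_log([[0.1, 0.9], [0.2, 0.8]])),  # favors symbol 1
}

if __name__ == "__main__":
    print(identify([0, 0, 0, 1, 0], TOY_MODELS))
```

Normalizing before the decision puts the scores of utterances of different lengths on a common 0 to 1 scale, so a single threshold can be applied to all of them; raw Viterbi log-scores grow more negative with every extra frame and are not directly comparable across durations.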
Keywords/Search Tags:target language identification, hidden Markov model, restricted Boltzmann machine, deep belief network, multi-feature fusion