Spoken Language Identification Using Deep Belief Network

Posted on:2017-05-28

Degree:Master

Type:Thesis

Country:China

Candidate:J J He

Full Text:PDF

GTID:2348330518494761

Subject:Information and Communication Engineering

Abstract/Summary:

Language is the most important means of human communication. With the development of economic globalization, communication between people has become more frequent and closer, yet language diversity has become an obstacle to communication. Meanwhile, with the rapid development of the Internet and multimedia technology, a large amount of audio and video data has emerged, and breaking through language barriers has become the key to obtaining useful information. Computer speech processing, including identification technologies such as natural language understanding and speaker identification, has become a hot topic in speech research and is important for speech recognition, multilingual information service systems, and military security.

Target language identification mainly detects voice messages that contain the target languages. It is a front-end component of speech recognition and plays an important role in speech processing. However, poor system robustness and a low recognition rate on short utterances have long been the difficulties of language identification systems. This thesis applies deep learning to language identification to address these shortcomings. The main work and innovations can be summarized as follows:

(1) Improve a traditional target language identification method based on the hidden Markov model (HMM). The input feature vector comprises MFCC, spectral bandwidth, pitch standard deviation, silence ratio, sub-band energy distribution, zero-crossing rate, and low-frequency energy ratio. After the HMMs are trained, the scores of audio in an unknown language are obtained with the Viterbi algorithm. This thesis adds a nonlinear-mapping normalization step before the decision; compared with deciding directly on the raw scores, the identification rate increases by 2%.

(2) Propose a method of target language identification based on the deep belief network (DBN). This thesis describes how to apply the restricted Boltzmann machine (RBM) and the DBN to language identification for speech segments of various durations, and experiments with combining various acoustic features with MFCC. The DNN structure using features extracted from the DBN achieves state-of-the-art performance on the language identification (LID) task. The MFCC features represented by the DBN provide robust representations that are useful for the LID task by themselves. The DBN acts as a bridge between a purely acoustic feature input and a purely phonotactic classification. When the MFCC features are used in conjunction with the other acoustic features, significant improvements are obtained. For similar languages, the DBN system shows some limitations.
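The HMM decision step in contribution (1) can be illustrated with a minimal sketch. It assumes one trained HMM per language, scores an utterance with the log-domain Viterbi recursion, and then maps the per-language scores through a nonlinear normalization before deciding. The abstract does not say which nonlinear mapping the thesis uses, so the softmax here is only a stand-in; the function names, the `threshold` parameter, and the language labels are hypothetical.

```python
import math


def viterbi_log_score(log_pi, log_A, log_B):
    """Log-probability of the single best state path through one HMM.

    log_pi[i]   : log initial probability of state i
    log_A[i][j] : log transition probability from state i to state j
    log_B[t][j] : log likelihood of frame t under state j
    """
    n_states = len(log_pi)
    # Initialisation with the first frame.
    delta = [log_pi[j] + log_B[0][j] for j in range(n_states)]
    # Recursion: keep only the best predecessor for every state.
    for t in range(1, len(log_B)):
        delta = [
            max(delta[i] + log_A[i][j] for i in range(n_states)) + log_B[t][j]
            for j in range(n_states)
        ]
    return max(delta)


def normalize_scores(scores, temperature=1.0):
    """Softmax over per-language scores.

    Assumption: a stand-in for the thesis's nonlinear mapping,
    which the abstract does not specify.
    """
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {lang: math.exp((s - top) / temperature) for lang, s in scores.items()}
    total = sum(exps.values())
    return {lang: e / total for lang, e in exps.items()}


def decide(scores, threshold=0.5):
    """Return the best language, or None if its normalized score is too low."""
    probs = normalize_scores(scores)
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else None
```

Calling `decide({"lang_a": -10.0, "lang_b": -12.0})` picks `lang_a`, while three equal scores yield `None` at the default threshold, which is how a normalized score can reject utterances that match no target language well. Deciding directly on raw Viterbi scores offers no such comparable scale across utterances of different lengths.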

Keywords/Search Tags:

target language identification, hidden Markov model, restricted Boltzmann machine, deep belief network, multi-feature fusion

Related items

1	The Improvement Of Restricted Boltzmann Machine And Its Application
2	Deep Learning Models And Applications Based On The Restricted Boltzmann Machine
3	Research On Medical Image Classification Method Based On Restricted Boltzmann Machine
4	Regression And Prediction Of Interval-valued Model Based On Restricted Boltzmann Machine
5	Research On Micro-Learning User Type Identification Based On Improved Deep Belief Network
6	Research On Music Classification Algorithm Based On Deep Belief Network And Hidden Markov Model
7	Research On Terrain Classification For Robots Based On Restricted Boltzmann Machine
8	Research On Shape Model Based On Deep Learning And Image Segmentation Application
9	Research On 3D Target Recognition Method Based On Feature Layer Fusion
10	Research On Parallel Sparse Deep Belief Networks