Research Of IV-PTN Hybrid System Based On Dynamic Weights For Language Recognition

Posted on:2020-06-27

Degree:Master

Type:Thesis

Country:China

Candidate:J Lin

GTID:2428330590978620

Subject:Electronic and communication engineering

Abstract/Summary:

Language recognition is a technology that automatically identifies the language spoken in speech data. It serves as a front end for speech recognition and related applications and is an important branch of speech technology. As with speaker identity in speaker recognition, language is a property of speech that can be characterized by suitable features. To meet the demand for shorter response times in current language recognition systems, this thesis proposes a dynamic-weight IV-PTN fusion system that improves performance on short utterances while maintaining good performance on long utterances, which in turn reduces system response time.

The thesis first introduces several language recognition methods. In terms of language characteristics, a language can be recognized from its acoustic, phonetic, or prosodic features, among others; in terms of technique, the methods include probabilistic models and neural network models. Considering industrial application trends in recent years, the thesis focuses on the neural-network-based PTN (Phonetic Temporal Neural) language recognition system and the probabilistic-model-based Ivector (identity vector) language recognition system. To make the Ivector system more robust, the two systems are combined into a new IV-PTN fusion system, which performs better on short utterances. After visualizing the misjudgment features of the two systems with the t-SNE algorithm, we observed that the regions in which the Ivector system and the PTN system misjudge most often do not coincide. We therefore added a dynamic weight module to the back end of the fusion system. The module automatically assigns fusion weights according to the misjudgment probability of each subsystem on a given speech segment, which yields better fusion performance.

The AP16-OL7 and AP17-OL3 data sets, which together contain ten languages, are used in the experiments, with Cavg, EER, ER, and DET as evaluation indicators. We examine the PTN and Ivector language recognition methods and build IV-PTN fusion systems with fixed weights and with dynamic weights on top of them. Experimental results show that the fixed-weight fusion system outperforms either single subsystem. Compared with fixed-weight systems and with dynamic-weight systems that model score vectors, the proposed model, which derives dynamic weights from misjudgment features, achieves better results.
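
The difference between fixed-weight and dynamic-weight fusion described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the example scores, and the rule that maps misjudgment probabilities to a weight (normalized reliability, 1 minus the error probability) are assumptions made for illustration only. The abstract does not specify how the misjudgment probabilities are estimated, so they are passed in as given values.

```python
from typing import List, Sequence


def fuse_fixed(iv_scores: Sequence[float], ptn_scores: Sequence[float],
               w_iv: float = 0.5) -> List[float]:
    """Linear score fusion with one weight shared by all utterances."""
    return [w_iv * a + (1.0 - w_iv) * b for a, b in zip(iv_scores, ptn_scores)]


def dynamic_weight(p_err_iv: float, p_err_ptn: float) -> float:
    """Hypothetical rule: weight the Ivector subsystem by its relative
    reliability (1 - misjudgment probability) on this utterance."""
    rel_iv = 1.0 - p_err_iv
    rel_ptn = 1.0 - p_err_ptn
    total = rel_iv + rel_ptn
    if total == 0.0:
        return 0.5  # both subsystems judged unreliable: fall back to equal weights
    return rel_iv / total


def fuse_dynamic(iv_scores: Sequence[float], ptn_scores: Sequence[float],
                 p_err_iv: float, p_err_ptn: float) -> List[float]:
    """Linear score fusion with a per-utterance weight."""
    return fuse_fixed(iv_scores, ptn_scores, dynamic_weight(p_err_iv, p_err_ptn))


def decide(scores: Sequence[float]) -> int:
    """Index of the highest-scoring language."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Made-up per-language scores for one short utterance (three languages).
iv = [0.7, 0.2, 0.1]    # Ivector subsystem favors language 0
ptn = [0.3, 0.6, 0.1]   # PTN subsystem favors language 1

fixed_choice = decide(fuse_fixed(iv, ptn, w_iv=0.5))
# Assumed misjudgment probabilities: Ivector is likely wrong on this utterance.
dynamic_choice = decide(fuse_dynamic(iv, ptn, p_err_iv=0.8, p_err_ptn=0.2))
print(fixed_choice, dynamic_choice)
```

With equal fixed weights the fused decision follows the Ivector subsystem, whereas the per-utterance weight shifts the decision toward the PTN subsystem when the Ivector subsystem is assumed to be unreliable on that segment.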

Keywords/Search Tags:

Language Recognition, Probabilistic Model, Ivector, Neural Network, Hybrid System, Dynamic Weight

Related items

1	Language Recognition System Based On Bottleneck Features
2	Feature Comparison And System Improvement In Multilingual Recognition
3	Based On The Design Of Small-vocabulary Speech Recognition System And Speech Recognition
4	Researching Of The Mongolian Language Model Based On Speech Recognition
5	Research On Chinese Letter Sign Language Recognition Based On Skeleton Features
6	A Split-Based Convolution And Dynamic Resolution For Light-Weight Neural Networks
7	Research On Continuous Action Recognition Based On Combining Deep Network And Probabilistic Graphical Model
8	Deep Learning Based Spoken Language Identification
9	Research On Deep Learning Based Far-Field Speech Recognition
10	Recurrent Neural Network Language Model For Continuous Speech Recognition