
Research On Chinese Speech Recognition And Emotion Recognition Based On Neural Network

Posted on: 2022-03-11
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
Full Text: PDF
GTID: 2518306488977129
Subject: Electronics and Communications Engineering
Abstract/Summary:
Language and speech are the most important and direct means of human communication, and they play an irreplaceable role in daily life. Speech conveys not only what the speaker wants to express but also rich emotion. With the development of neural networks and the continuing advance of artificial intelligence, expectations for speech recognition and emotion recognition keep rising, which has driven a series of research and development efforts on both technologies. Chinese, our mother tongue, is the most spoken language in the world. It has many synonyms and homophones, as well as consonants and tones, which makes recognition complicated and difficult. As a result, the performance of Chinese speech recognition and emotion recognition is still not ideal. Neural networks, a focus of intense attention in recent years, have achieved remarkable results in many fields, including speech recognition and emotion recognition. This thesis therefore takes Chinese speech recognition and emotion recognition as its research objects and builds neural-network-based models to recognize Chinese speech and the emotion it carries. The main work completed in this thesis includes:

(1) To study Chinese speech recognition, a WaveNet-CTC model was constructed by combining the WaveNet model introduced by Google with the connectionist temporal classification (CTC) loss, and was trained on the THCHS-30 speech data set. Experimental results show that the recognition rate obtained with this model is 85.0%, which verifies the feasibility of the WaveNet-CTC model.

(2) To better recognize the emotion type in a speech signal, a BP neural network optimized by a mutated seagull algorithm (VSOA-BP) is proposed. The algorithm borrows the mutation idea of the genetic algorithm: a mutation operation reinitializes variables with a certain probability, which improves the accuracy of the algorithm and helps it avoid falling into local optima. The improved algorithm is then used to optimize the weights and thresholds of the BP neural network, which is applied to speech emotion recognition. In a comparison of four networks (PSO-BP, VPSO-BP, SOA-BP, and VSOA-BP), the experimental results show that the VSOA-BP network improves the speech emotion recognition rate more effectively and accelerates network convergence, indicating that the improved algorithm is feasible and effective.

(3) Building on this research into Chinese speech recognition technology, an intelligent Chinese speech system is designed for practical application. The system provides two main functions, speech recognition and speech synthesis, and allows free switching between two speech synthesis modes.
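The mutation operation described in task (2) can be illustrated with a minimal sketch. The abstract only states that variables are reinitialized with a certain probability, so everything else here is an assumption made for illustration: the function names `mutate` and `mutate_population`, the uniform re-draw within fixed bounds, and the per-coordinate mutation rate are hypothetical and not taken from the thesis.

```python
import random


def mutate(position, lower, upper, rate, rng=random):
    """Return a copy of `position` where each coordinate is, with
    probability `rate`, re-drawn uniformly from [lower, upper].

    Assumption: the thesis does not specify how variables are
    reinitialized; a uniform re-draw is used here as one plausible choice.
    """
    return [
        rng.uniform(lower, upper) if rng.random() < rate else x
        for x in position
    ]


def mutate_population(population, lower, upper, rate, rng=random):
    """Apply `mutate` independently to every candidate solution."""
    return [mutate(p, lower, upper, rate, rng) for p in population]


if __name__ == "__main__":
    # Hypothetical example: 4 candidates, each a vector of 3 network weights.
    rng = random.Random(0)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(4)]
    mutated = mutate_population(population, -1.0, 1.0, rate=0.1, rng=rng)
    for before, after in zip(population, mutated):
        print(before, "->", after)
```

In an optimizer of this kind, such a step would typically run once per iteration after the regular position update, so that a few candidates are scattered back across the search space while the rest keep converging. Where exactly the thesis applies it is not stated in the abstract.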
Keywords/Search Tags:WaveNet-CTC Model, Speech Recognition, Mutated Seagull Algorithm, Neural Network, Emotion Recognition, Intelligent Chinese Voice System