
Research On Tibetan Speech Emotion Recognition Method

Posted on: 2020-11-14
Degree: Master
Type: Thesis
Country: China
Candidate: R L Z Ci
GTID: 2428330599952151
Subject: Computer application technology
Abstract/Summary:
With the development of natural language processing technology and the deepening of research, Tibetan information processing has entered the era of natural language processing. Its research focus has shifted from "word"-level research to the language and speech processing level. At present, key technologies such as Tibetan automatic word segmentation, part-of-speech tagging, semantic comprehension, information retrieval, machine translation, speech recognition and speech emotion recognition have all become active research fields, among which Tibetan speech recognition, and especially Tibetan speech emotion recognition, is receiving more and more attention. Because speech emotion recognition started late and the technology is still immature, the field has many technical gaps and few research results, while social demand and research value are high. Tibetan speech emotion recognition has therefore become a research hotspot in Tibetan speech information processing. The main contents of this paper are as follows:

1. This paper analyzes the research status and trends of Chinese and English speech recognition and speech emotion recognition at home and abroad, and then combines them with the characteristics of Tibetan itself to set out the origin and significance of the topic.

2. Based on a study of speech recognition technology, this paper summarizes three key approaches to speech recognition: template matching, acoustic-linguistic models and artificial neural networks. It also introduces the corpus construction process, the pronunciation dictionary creation process, the feature parameter extraction method and the algorithms involved in speech recognition, mainly the Hidden Markov Model, the Baum-Welch algorithm, the EM algorithm, the Gaussian Mixture Model and the DTW algorithm.

3. Drawing on Tibetan linguistics and on the physiological basis and physical properties of Tibetan phonetics, this paper introduces in detail the four major elements of Tibetan speech (pitch, intensity, length and timbre) as well as the pronunciation characteristics of Tibetan phonetics.

4. Collection and organization of the corpora. This paper uses a Python web crawler together with manual collection to gather more than 50,000 Tibetan sentences of text. The collected corpus is pre-processed by machine and by manual proofreading (word segmentation and sample labeling), so that more than 1,000 Tibetan sentences from the original text corpus can be used as the sample corpus. On this basis, the paper uses the Cool Edit Pro 2.0 recording software as an auxiliary tool for speech acquisition, records more than 1,000 Tibetan sentences, and preprocesses the recorded speech data (noise reduction and labeling) to turn the more than 1,000 recordings of the initial speech corpus into usable speech sample data.

5. This paper introduces Tibetan speech recognition technology and uses MATLAB to analyze the speech signals of each emotion category. It then proposes a Tibetan speech feature extraction method based on the Mel frequency scale of the Mel Frequency Cepstral Coefficients (MFCC) and the Mel filter bank.

6. This paper puts forward a method for selecting the emotional features of Tibetan sentences from the aspects of part of speech, emotional words, transition words, negative words and degree adverbs, and uses this method to classify seven emotions (happy, love, angry, sad, fearful, disgust, shocked). According to the experimental results of the Tibetan emotion classification, the overall average accuracy rate is 76%, the recall rate is 75% and the F value is 75%. To some extent, the method can be applied to the emotion classification of Tibetan.

Although research on Tibetan speech emotion recognition is still in its infancy and results are few, the exploration of Tibetan speech emotion recognition methods in this paper can lay a foundation for research in this field and help drive further work.
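
The Mel-frequency feature extraction mentioned in point 5 can be illustrated with a small sketch. The abstract does not say which Mel formula or filter-bank settings the thesis uses, so the code below assumes the widely used conversion mel = 2595 * log10(1 + f / 700); the function names, the 26 filters and the 0 to 8000 Hz range are hypothetical choices for illustration, not the author's configuration. It only shows how the centre frequencies of a Mel filter bank are spaced, not a full MFCC pipeline.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Centre frequencies (Hz) of n_filters filters spaced evenly on the mel scale.

    The band edges f_min_hz and f_max_hz are excluded, so the centres sit
    strictly between them, as for a bank of triangular filters.
    """
    mel_min = hz_to_mel(f_min_hz)
    mel_max = hz_to_mel(f_max_hz)
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]


if __name__ == "__main__":
    # Illustrative settings only: 26 filters over 0-8000 Hz.
    for centre in mel_center_frequencies(26, 0.0, 8000.0):
        print(round(centre, 1))
```

Because the spacing is uniform in mels rather than in Hz, the centres are dense at low frequencies and progressively sparser at high frequencies, which is the property that makes the Mel filter bank a rough model of perceived pitch.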
Keywords/Search Tags:Tibetan speech recognition, feature extraction, Hidden Markov Model, Mel frequency, emotion classification