
Research On Emotion Recognition Technology Of Tibetan Speech By Fusion Of Multiple Features

Posted on: 2024-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: M Z X Peng
GTID: 2555307067468324
Subject: Computer Science and Technology
Abstract/Summary:
Speech carries not only semantic information but also rich emotional information, and emotion plays a very important role in human-computer interaction. Speech Emotion Recognition (SER) refers to extracting the acoustic features that express emotion from speech signals and determining the correspondence between these acoustic features and human emotions. Speech emotion recognition for Chinese, English and other languages has achieved remarkable results, whereas speech emotion analysis for Tibetan is still in its infancy. This thesis studies Tibetan speech emotion recognition from four aspects: the construction of an emotional speech database, the extraction and analysis of emotional speech features, the construction of a speech emotion recognition model, and the design and implementation of a system.

(1) Construction of the emotional speech database. By analyzing and comparing the emotion classifications and database construction methods used for Chinese, English and other languages, this thesis designs a construction scheme for a Tibetan emotional speech database. The scheme covers emotion classification, emotional speech collection, emotional speech annotation and validity analysis for Tibetan speech. Following this scheme, an emotion type set for Tibetan speech emotion analysis (TESCS-9) was established, a Tibetan emotional speech database (TESDB-2745) was built using recording and editing methods, and the validity of the database was evaluated with an improved fuzzy comprehensive evaluation method, laying the foundation for Tibetan speech emotion analysis.

(2) Feature extraction and analysis of emotional speech. To reveal the relationship between the prosodic features of Tibetan emotional speech and emotional states, this thesis takes the Tibetan emotional speech database (TESDB-2745) as the research object. Prosodic features were extracted from 2745 Tibetan emotional speech samples covering 9 emotion types: happiness, anger, sadness, fear, disgust, surprise, neutral, exaggeration and anxiety. The distribution of these prosodic features was analyzed, together with the relationship between the prosodic features and emotional states, to provide a data basis for emotion analysis of Tibetan speech.

(3) Construction of the speech emotion recognition model and system development. The performance of traditional machine learning models commonly used in speech emotion recognition was first compared on the Tibetan speech emotion recognition task. To address the low accuracy of emotion recognition based on a single feature, multiple features were then fused: duration, fundamental frequency and fundamental frequency tuning offset, zero-crossing rate, energy, amplitude, Mel frequency, Mel-frequency cepstral coefficients (MFCC), chroma, spectral centroid, spectral flatness and spectral contrast. A Tibetan speech emotion recognition model fusing these features was built with LSTM as its framework, a visualization system for multi-feature Tibetan speech emotion recognition was designed and implemented, and the effectiveness of the model was verified experimentally.
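
The feature fusion described in (3) can be illustrated with a minimal sketch. The abstract does not say how the features are computed or combined, so everything here is an assumption made for illustration: the function names (`frame_signal`, `fuse_features` and the rest) are hypothetical, fusion is taken to mean concatenating per-frame features into one vector, and only five of the listed features are computed (zero-crossing rate, energy, amplitude, spectral centroid and spectral flatness). The result is a sequence of fused vectors, one per frame, which is the kind of input a recurrent model such as an LSTM consumes.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames, dropping any ragged tail."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def peak_amplitude(frame):
    """Largest absolute sample value in one frame."""
    return max(abs(x) for x in frame)


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short demo frames)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        mags.append(math.hypot(re, im))
    return mags


def spectral_centroid(mags, sample_rate, frame_len):
    """Magnitude-weighted mean frequency in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / frame_len * m for k, m in enumerate(mags)) / total


def spectral_flatness(mags, eps=1e-10):
    """Geometric mean over arithmetic mean of the power spectrum (0 = tonal)."""
    power = [m * m + eps for m in mags]
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    return geometric / (sum(power) / len(power))


def fuse_features(samples, sample_rate, frame_len=64, hop=32):
    """Return one fused feature vector per frame (assumed fusion: concatenation)."""
    fused = []
    for frame in frame_signal(samples, frame_len, hop):
        mags = magnitude_spectrum(frame)
        fused.append([
            zero_crossing_rate(frame),
            rms_energy(frame),
            peak_amplitude(frame),
            spectral_centroid(mags, sample_rate, frame_len),
            spectral_flatness(mags),
        ])
    return fused


if __name__ == "__main__":
    # Synthetic 1 kHz tone standing in for a speech recording.
    sr = 8000
    tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(256)]
    for vector in fuse_features(tone, sr):
        print([round(v, 4) for v in vector])
```

In practice the fused features differ widely in scale (the centroid is in Hz, the zero-crossing rate lies between 0 and 1), so each dimension would normally be normalized before being passed to a sequence model. An audio library would also replace the naive DFT used here.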
Keywords/Search Tags:Speech signal processing, Speech emotion recognition, Emotional speech database, Emotional features