Research On Speaker Three-dimensional Feature Recognition Based On Deep Learning

Posted on:2021-03-03

Degree:Master

Type:Thesis

Country:China

Candidate:J Chen

GTID:2428330611950445

Subject:Information and Communication Engineering

Abstract/Summary:

Speech recognition plays an important role in human-computer interaction, artificial intelligence (AI), natural language processing (NLP) and related fields, and it is an active research topic. Speaker three-dimensional feature recognition analyzes a speaker's voice signal to extract the information that represents the speaker's gender, age and emotion, and then identifies all three attributes. It has practical value for criminal investigation, intelligent hospitals and intelligent courts. For example, recognizing a driver's emotional state makes it possible to issue an early warning and reduce traffic accidents, and in psychological counseling, accurately identifying a visitor's emotions helps the consultation proceed smoothly. The traditional classifiers Softmax, Support Vector Machine (SVM) and eXtreme Gradient Boosting (XGBoost) perform well on a single attribute such as the speaker's gender, age or emotion, but perform poorly on multi-dimensional features of two dimensions (gender and age) or more. A multi-modal fusion method is used to fuse two single-modal deep learning models, BiLSTM and CNN, into a deep feature extraction model (BiLSTM_CNN). A multi-modal feature fusion method is used to fuse the single-modal time-domain, frequency-domain and text features, yielding feature data that better represents the speaker's speech information. To address the weak learning ability of deep neural networks on a small number of speech samples, this thesis proposes transferring the feature knowledge learned by the deep feature extraction model (BiLSTM_CNN) to Softmax, SVM and XGBoost for target-task learning. Experiments show that the proposed BiLSTM_CNN model, with SVM as the target-task learner, achieves a good classification effect on the three-dimensional recognition of gender, age and emotion.
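As a rough illustration of what "three-dimensional" recognition and feature-level fusion could look like in code, the sketch below treats every (gender, age, emotion) combination as one joint class and fuses per-modality feature vectors by simple concatenation. The abstract specifies neither the label categories nor the fusion rule, so the label sets, the concatenation step and the names `encode`, `decode` and `fuse` are all assumptions made for this example, not the thesis's actual method.

```python
from itertools import product

# Hypothetical label sets: the abstract does not list the real categories.
GENDERS = ("female", "male")
AGES = ("child", "adult", "senior")
EMOTIONS = ("neutral", "happy", "angry", "sad")

# Assumption: each (gender, age, emotion) combination is one joint class.
JOINT_CLASSES = list(product(GENDERS, AGES, EMOTIONS))
CLASS_INDEX = {combo: i for i, combo in enumerate(JOINT_CLASSES)}


def encode(gender, age, emotion):
    """Map a (gender, age, emotion) triple to a single class index."""
    return CLASS_INDEX[(gender, age, emotion)]


def decode(index):
    """Recover the (gender, age, emotion) triple from a class index."""
    return JOINT_CLASSES[index]


def fuse(time_feats, freq_feats, text_feats):
    """Feature-level fusion by concatenation (an assumed fusion rule)."""
    return list(time_feats) + list(freq_feats) + list(text_feats)


if __name__ == "__main__":
    label = encode("male", "adult", "happy")
    print(label, decode(label))
    print(fuse([0.1, 0.2], [0.3], [0.4, 0.5]))
```

With these illustrative label sets the joint space has 2 × 3 × 4 = 24 classes, which hints at why a classifier that handles one attribute well can struggle once attributes are combined: each joint class receives far fewer training samples.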

Keywords/Search Tags:

Deep learning, Multi-modal fusion, Gender identification, Age recognition, Emotion recognition

Related items

1	Emotion Recognition Based On Multi-modal Information Fusion
2	Multi-modal Emotion Recognition Based On Deep Learning
3	A Study Of Deep Learning Based Multimodal Emotion Recognition
4	Research On Multi-modal Biometric Identification Method Based On Convolutional Neural Network
5	Research On Emotion Recognition Of Monomodal Speech And Multimodal Speech Vision Based On Transfer Learning
6	Research On Multi-modal Emotion Recognition Based On Broad Learning System
7	Research On Multi-Modal Emotion Recognition Based On Deep Learning And Feature Fusion
8	Research On Speech Emotion Recognition Method Based On Multi-feature And Multi-modal Fusion
9	Research Of Multi-Modal Emotion Recognition Based On Deep Learning
10	Research Of Emotion Recognition Based On Multi-modal Fusion