
Speech Based Identification And Emotion Information Extraction And Its Application In Pervasive Computing

Posted on: 2008-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: C Wang
Full Text: PDF
GTID: 2178360242966108
Subject: Signal and Information Processing
Abstract/Summary:
This thesis studies speaker recognition and emotion extraction for ubiquitous services in pervasive computing.

Because a real-time monitoring system imposes time limits, speaker recognition must balance efficiency against accuracy rather than optimize accuracy alone: the system's running speed must improve while recognition accuracy is preserved. We therefore improve both the feature-extraction and classification stages. We refine MFCC feature extraction and propose a quick MFCC algorithm that meets the real-time requirement while retaining high precision. To validate it, we compare the algorithm against LPC and FFT features under a Euclidean-distance classifier. In the experiments, the EER is 14.3% for LPC and 11.4% for FFT, but only 4.3% for quick MFCC, with a system run time of about 4.0 s, which meets the real-time requirement.

Building on the quick MFCC, we then compare differential MFCC against the other features under a classifier that fuses VQ with GMM. Here the EER is 14.4% for LPC, 12.5% for FFT, 9.4% for quick MFCC, and 6.9% for differential MFCC.

Finally, we compare all the classification methods in this thesis using differential MFCC features. The EER is 15% for Euclidean distance, 11.2% for VQ, 4.4% for GMM, and 6.9% for the VQ-GMM fusion. Although GMM achieves the best recognition accuracy, its run time of about 6.0 s is worse than that of the fusion method, which needs only about 4.5 s to produce a result.

For emotion extraction, we mainly use pitch-processing methods to decide the speaker's emotional state.
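The thesis does not spell out how the differential (delta) MFCC coefficients are formed; the standard definition is a regression over neighboring frames, d_t = Σ n·(c_{t+n} − c_{t−n}) / (2·Σ n²). A minimal NumPy sketch of that standard formula (the window half-width `N=2` is an illustrative assumption, not a value taken from the thesis):

```python
import numpy as np

def delta(cepstra, N=2):
    """Delta (differential) coefficients of a (frames, coeffs) cepstral matrix.

    Uses the standard regression formula with edge-padding at the boundaries;
    N is the half-width of the regression window (assumed, not from the thesis).
    """
    denom = 2 * sum(n * n for n in range(1, N + 1))
    # Repeat the first/last frame so boundary frames have full neighborhoods.
    padded = np.pad(cepstra, ((N, N), (0, 0)), mode="edge")
    out = np.zeros_like(cepstra, dtype=float)
    for t in range(cepstra.shape[0]):
        out[t] = sum(
            n * (padded[t + N + n] - padded[t + N - n]) for n in range(1, N + 1)
        ) / denom
    return out
```

On a cepstral sequence that grows linearly across frames, the interior delta values come out as the constant slope, which is a quick sanity check for the window normalization.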
We then apply both methods to an e_Learning system, which can be seen as a ubiquitous service with the character of "Anytime, Anywhere, Invisible".
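All of the comparisons above are reported as equal error rates. The EER is the operating point where the false-acceptance rate equals the false-rejection rate; a minimal sketch of how it can be computed from genuine and impostor score lists (this is a generic implementation, not the evaluation code used in the thesis):

```python
import numpy as np

def eer(genuine, impostor):
    """Equal error rate: the point where false-accept rate ~ false-reject rate.

    genuine  -- similarity scores for true-speaker trials
    impostor -- similarity scores for impostor trials
    Sweeps every observed score as a threshold and returns the mean of FAR
    and FRR at the threshold where they are closest.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    best_gap, best_eer = float("inf"), 1.0
    for th in np.sort(np.concatenate([genuine, impostor])):
        far = float(np.mean(impostor >= th))  # impostors wrongly accepted
        frr = float(np.mean(genuine < th))    # true speakers wrongly rejected
        if abs(far - frr) < best_gap:
            best_gap, best_eer = abs(far - frr), (far + frr) / 2
    return best_eer
```

For perfectly separated score distributions the sweep finds a threshold with FAR = FRR = 0, so the EER is 0; heavily overlapping distributions push it toward 0.5.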
Keywords/Search Tags:MFCC, LPC, FFT, VQ, GMM