Research On The Recognition And Conversion System Of Speech

Posted on: 2015-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: M Zhao
GTID: 2298330467451379
Subject: Communication and Information System
Abstract/Summary:
In recent years, with the vigorous development of signal processing, artificial intelligence, and Internet technology, multimedia applications have gradually entered all aspects of people's lives. Image and video technologies have already been applied in real-world scenarios such as face recognition, fingerprint recognition, video event detection, and 3D movies, making people's lives increasingly colorful. Speech-related technology clearly lags behind, especially in the processing of the personality characteristics of speech. This is mainly due to the large amount of data in speech signals and the difficulty of modeling them. As speech processing has attracted more attention, scholars in China and abroad have published a large number of relevant research results. So far, however, only a small number of applications have been put into practice with good results, and the technology requires further research.

This paper expounds the relevant technologies of speech signal processing, especially methods for the extraction, modeling, identification, and transformation of the personality characteristics of speech. It then puts forward a series of methods, mainly covering the following aspects:

1) A new high-precision, large-scale, noise-robust method for correcting the speech fundamental frequency is proposed. Experimental results show that this method can precisely correct the fundamental frequency under large estimation deviation and low SNR. An improved method for extracting a voice spectrum that resists periodic interference is also proposed; this method can eliminate the periodic interference using a very short time-domain signal.
2) Speech characteristics are modeled with a GMM, and a recognition method based on the GMM's proportion vector is proposed. In addition, the microstructure of the speech spectral envelope and the jitter of the fundamental frequency track are modeled for comprehensive identification. Experimental results indicate that this algorithm has high recognition efficiency.
3) A transformation method based on independent GMM modeling is proposed, greatly reducing the computational complexity of training and transformation. A voice conversion system using the above methods is proposed, and good conversion results are obtained.
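
The abstract does not describe how the fundamental-frequency correction in item 1) actually works, so the sketch below is not the thesis method. It only illustrates the general idea of correcting a fundamental frequency track, using a common stand-in technique: each voiced frame is compared with the median of its voiced neighbours, and octave errors (doubling or halving) are undone when that moves the value closer to the median. The function name `correct_f0_track`, its parameters, and the sample values are all hypothetical.

```python
from statistics import median


def correct_f0_track(f0, half_window=2, factors=(0.5, 1.0, 2.0)):
    """Undo octave jumps in an F0 track given in Hz.

    Frames with a value <= 0 are treated as unvoiced and returned as 0.0.
    This is a generic illustration, not the method proposed in the thesis.
    """
    corrected = []
    for i, f in enumerate(f0):
        if f <= 0:
            corrected.append(0.0)
            continue
        # Reference pitch: median of the voiced raw values around frame i.
        lo = max(0, i - half_window)
        hi = min(len(f0), i + half_window + 1)
        neighbours = [x for x in f0[lo:hi] if x > 0]
        ref = median(neighbours)
        # Keep the candidate (halved, unchanged, doubled) closest to ref.
        best = min((f * k for k in factors), key=lambda v: abs(v - ref))
        corrected.append(best)
    return corrected


# Hypothetical track with one doubling error (240) and one halving error (61.5).
raw = [120, 121, 240, 122, 0, 123, 61.5, 124, 125]
fixed = correct_f0_track(raw)
print(fixed)
```

A median reference is used here because it is insensitive to a single outlier inside the window. A track with long runs of octave errors, or with heavy noise as in the low-SNR case the abstract mentions, would need a more robust approach than this sketch.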
Keywords/Search Tags: speech personality characteristics, analysis/synthesis model, spectral envelope, fundamental frequency track