
Research on Embedded Speech Synthesis Technology

Posted on: 2012-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: H Huang
Full Text: PDF
GTID: 2178330338497958
Subject: Circuits and Systems
Abstract/Summary:
With the continuing development of the economy and of technology, robots are increasingly used in production and social activities. In human-computer interaction, speech is the most natural means of communication for people, so speech recognition and speech synthesis have become research hotspots in the field. Computer-based Chinese speech synthesis is now mature and produces high-quality output, for example in voice e-mail. In embedded systems, however, limited storage capacity and processor speed mean that the intelligibility and naturalness of synthesized speech are still not high enough. One of the most important reasons is that the speech rate of the synthesized output cannot be adjusted. To address this problem, this thesis studies speech rate control methods for embedded systems in depth, with the aim of improving the naturalness and intelligibility of synthesized speech.
First, the thesis reviews the basic theory of time-domain and frequency-domain analysis of speech signals. In the time domain, it analyzes short-time windowing, endpoint detection, short-time average energy, zero-crossing rate, the autocorrelation function, and related measures. In the frequency domain, it discusses the short-time Fourier transform, the spectrogram, and related issues. It then describes in detail how to implement endpoint detection, pitch-period estimation, and formant estimation. Finally, the algorithms are simulated and verified in MATLAB.
The ultimate goal of this research is an unattended automatic interpretation system that uses speech synthesis to convert content stored as text into a speech signal for output.
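As an illustration of the time-domain measures named above, the following is a minimal Python sketch of short-time energy, zero-crossing rate, and a simple energy-threshold endpoint detector. It is not the thesis's MATLAB implementation: the function names, the rectangular window, the frame length, the hop size, and the `energy_ratio` threshold are all assumptions chosen for the example.

```python
import math

def split_into_frames(signal, frame_len, hop):
    """Split a sample sequence into overlapping fixed-length frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in one frame (rectangular window assumed)."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def detect_endpoints(signal, frame_len=80, hop=40, energy_ratio=0.1):
    """Return (start_sample, end_sample) of the active region, or None.

    A frame counts as active when its energy exceeds energy_ratio times
    the largest frame energy. This threshold rule is an assumption made
    for the sketch, not a rule taken from the thesis.
    """
    energies = [short_time_energy(f)
                for f in split_into_frames(signal, frame_len, hop)]
    if not energies or max(energies) == 0:
        return None
    threshold = energy_ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * hop, active[-1] * hop + frame_len

# Synthetic input: silence, a 200 Hz tone sampled at 8 kHz, silence.
fs = 8000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(800)]
signal = [0.0] * 400 + tone + [0.0] * 400

endpoints = detect_endpoints(signal)
tone_zcr = zero_crossing_rate(tone[:80])
silence_zcr = zero_crossing_rate(signal[:80])
```

Here the tone occupies samples 400 to 1199, so the detected endpoints should bracket that range to within one frame. Practical detectors often combine energy and zero-crossing rate, since unvoiced consonants tend to have low energy but a high zero-crossing rate.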
Text-to-speech systems built on Chinese speech synthesis chips cannot adjust the playback rate. To solve this problem, the thesis proposes a method based on special marker characters: the markers divide the text into information frames of different types, and the frames are then transmitted to the MCU. The system automatically determines the type of each information frame and sets a different delay time for each type, which adjusts the speech broadcast rate. Experiments show that the proposed marker-character method not only allows the speech rate to be adjusted at will but also improves the intelligibility and naturalness of the broadcast speech. Compared with the traditional pulse code modulation (PCM) method, the text-to-speech method reduces the required memory capacity by at least 80%, which makes speech synthesis practical in embedded systems.
On the basis of this speech synthesis technology, an embedded Chinese speech synthesis system was designed and developed. The system has been applied to experimental teaching in electronic technology, where practical testing showed good results.
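The marker-character idea described above can be sketched in Python as follows. The abstract does not give the actual marker characters, frame types, or delay values, so the `FRAME_TYPES` table, the function names, and the timings here are hypothetical placeholders. The sketch only shows the general shape: split marked text into typed frames, then attach a per-type delay to each frame.

```python
# Hypothetical marker characters mapped to (frame type, delay in ms).
# The real markers and timings used in the thesis are not stated
# in the abstract.
FRAME_TYPES = {
    "#": ("normal", 200),
    "@": ("number", 500),
    "$": ("emphasis", 800),
}

def split_frames(marked_text):
    """Split marked text into (frame_type, payload, delay_ms) tuples.

    Text before the first marker is treated as a 'normal' frame.
    """
    result = []
    frame_type, delay = FRAME_TYPES["#"]
    payload = []
    for ch in marked_text:
        if ch in FRAME_TYPES:
            # A marker closes the current frame and opens a new one.
            if payload:
                result.append((frame_type, "".join(payload), delay))
                payload = []
            frame_type, delay = FRAME_TYPES[ch]
        else:
            payload.append(ch)
    if payload:
        result.append((frame_type, "".join(payload), delay))
    return result

def playback_schedule(frame_list):
    """Return (payload, delay_ms) pairs in playback order.

    On real hardware the MCU would send each payload to the synthesis
    chip and then wait delay_ms before sending the next frame.
    """
    return [(payload, delay) for _, payload, delay in frame_list]

parsed = split_frames("#The voltage is@3.3$volts")
schedule = playback_schedule(parsed)
```

Under these assumptions, longer delays after "number" and "emphasis" frames slow the effective speech rate at the points where a listener needs more time, without changing how the synthesis chip renders each frame.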
Keywords/Search Tags: speech synthesis, linear prediction synthesis, text to speech, automatic interpretation system