
Realization Of Speech-to-gesture Conversion Based On FPGA

Posted on: 2017-05-25    Degree: Master    Type: Thesis
Country: China    Candidate: Z S Bai    Full Text: PDF
GTID: 2348330488970884    Subject: Electronic and communication engineering
Abstract/Summary:
Currently, more than 630 million people in the world suffer great distress in daily life and learning because of hearing impairment. Although existing research has achieved gesture-to-speech conversion, research on speech-to-gesture conversion is still insufficient, and a great communication barrier remains between people with hearing and speech impairments and the outside world. The thesis therefore designs and implements a speech-to-gesture conversion system based on an FPGA.

Acoustic models of isolated words were first trained, and at the same time the gesture images of the isolated words were recorded according to "Chinese Sign Language". On this basis, the thesis realized speech-to-gesture conversion on an FPGA: the hand gesture images and the trained isolated-word acoustic models were stored in the SDRAM of the FPGA board, the input isolated-word speech signal was matched against the acoustic model of each isolated word to output the optimal matching result, and finally the gesture image of the matched isolated word was displayed on the LCD screen of the FPGA board. The major work and contributions of the thesis are as follows:

Firstly, the thesis implements hidden Markov model (HMM) based isolated-word speech recognition. The speech signals of 20 isolated words are recorded for training the acoustic models, with Mel-frequency cepstral coefficients (MFCC) adopted as acoustic features. The Hidden Markov Model Toolkit (HTK) is used for model training and speech recognition. Experimental results show that the realized isolated-word recognition achieves a 100% recognition rate for a dependent speaker.

Secondly, the thesis records 20 gesture images of the isolated words. The gestures are selected from the daily-conversation teaching materials of "Chinese Sign Language" and comprise the 11 Arabic numerals from 0 to 10 and 9 words: praise, friendly, rejection, gratitude, good, contempt, caring, loving and poor; the 20 isolated words correspond one-to-one to the gesture images. The gesture images are saved in BMP format with a resolution of 240x320 and are used to display the final gesture image of the isolated-word recognition result on the LCD screen.

Thirdly, the thesis realizes an FPGA-based speech-to-gesture conversion system. A NIOS II soft-core processor system is embedded in the FPGA chip EP4CE115F29C7N to complete four parts of the work: speech signal acquisition, speech decoding and storage, speech recognition and pattern matching, and gesture image display. All modules of the system are debugged separately, and then the system is debugged with all modules integrated. In addition, a man-machine interface is designed by combining the features of the SOPC to achieve speech-to-gesture conversion on the FPGA platform.

Finally, the thesis tests the realized system under different conditions. The running speed of the hardware platform is compared with that of a software platform; experimental results show that the FPGA-based hardware platform is about 30 times faster than the software platform in computation and recognition. Two different circumstances are used to test the speech recognition rate: the average recognition rate is 100% in the speaker-dependent case and 82.6% in the speaker-independent case, while in a noisy environment the average recognition rates for the speaker-dependent and speaker-independent cases are 88.9% and 72.6% respectively.
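The matching step described in the abstract scores the MFCC sequence of an input utterance against the HMM of each of the 20 isolated words and outputs the best-scoring word. The C sketch below illustrates that idea with Viterbi scoring over left-to-right HMMs with diagonal-Gaussian emissions; the structure names, state count and feature dimension are assumptions for illustration only, since the thesis trains and stores its models with HTK and does not give the actual parameters.

```c
/* Minimal sketch of isolated-word scoring with left-to-right HMMs.
 * All model structures and sizes here are assumed for illustration. */
#include <float.h>
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define N_STATES 5      /* states per word model (assumed)        */
#define N_MFCC   12     /* MFCC coefficients per frame (assumed)  */
#define MAX_T    200    /* maximum number of frames per utterance */

typedef struct {
    double log_a[N_STATES][N_STATES];   /* log transition probabilities */
    double mean[N_STATES][N_MFCC];      /* Gaussian mean per state      */
    double var[N_STATES][N_MFCC];       /* diagonal variance per state  */
} WordHMM;

/* Log-likelihood of one MFCC frame under a diagonal Gaussian. */
static double log_gauss(const double *x, const double *mu, const double *var)
{
    double s = 0.0;
    for (int d = 0; d < N_MFCC; ++d) {
        double diff = x[d] - mu[d];
        s += -0.5 * (log(2.0 * M_PI * var[d]) + diff * diff / var[d]);
    }
    return s;
}

/* Viterbi log-score of an utterance (T frames, T <= MAX_T) against one model. */
static double viterbi_score(const WordHMM *m, double obs[][N_MFCC], int T)
{
    double delta[MAX_T][N_STATES];

    /* Initialisation: the utterance starts in state 0. */
    for (int j = 0; j < N_STATES; ++j)
        delta[0][j] = (j == 0) ? log_gauss(obs[0], m->mean[0], m->var[0]) : -DBL_MAX;

    for (int t = 1; t < T; ++t) {
        for (int j = 0; j < N_STATES; ++j) {
            double best = -DBL_MAX;
            for (int i = 0; i <= j; ++i) {          /* left-to-right topology */
                double cand = delta[t - 1][i] + m->log_a[i][j];
                if (cand > best) best = cand;
            }
            delta[t][j] = best + log_gauss(obs[t], m->mean[j], m->var[j]);
        }
    }
    return delta[T - 1][N_STATES - 1];              /* end in the final state */
}

/* Pick the word model with the highest Viterbi score. */
int recognize_word(const WordHMM models[], int n_words, double obs[][N_MFCC], int T)
{
    int best_word = -1;
    double best_score = -DBL_MAX;
    for (int w = 0; w < n_words; ++w) {
        double s = viterbi_score(&models[w], obs, T);
        if (s > best_score) { best_score = s; best_word = w; }
    }
    return best_word;
}
```

In the thesis the training and recognition are performed with HTK rather than hand-written code; the sketch only shows the underlying scoring idea behind "matching the input speech against the acoustic model of each isolated word".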
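Similarly, the four-part NIOS II firmware flow (speech acquisition, decoding and storage, recognition and matching, gesture display) can be pictured as a simple control loop. The helper functions below are hypothetical placeholders for the board-specific codec, SDRAM and LCD drivers, which the abstract does not describe; only the vocabulary size and the 240x320 BMP image format are taken from the thesis.

```c
/* Sketch of the top-level NIOS II control loop, assuming hypothetical
 * driver hooks in place of the real SOPC peripheral drivers. */
#include <stdint.h>

#define FRAME_LEN   256     /* samples per analysis frame (assumed)   */
#define N_MFCC      12      /* MFCC coefficients per frame (assumed)  */
#define MAX_FRAMES  200     /* maximum frames per utterance (assumed) */
#define N_WORDS     20      /* vocabulary size (from the thesis)      */
#define IMG_W       240     /* gesture image width  (from the thesis) */
#define IMG_H       320     /* gesture image height (from the thesis) */

/* Hypothetical driver hooks standing in for the real SOPC peripherals. */
int  audio_read_frame(int16_t *buf, int n);                 /* codec -> PCM frame  */
int  utterance_ended(const int16_t *buf, int n);            /* simple endpointing  */
void extract_mfcc(const int16_t *buf, int n, double *out);  /* one MFCC vector     */
int  match_utterance(const double *mfcc, int n_frames);     /* HMM Viterbi scoring */
void lcd_draw_bmp(const uint8_t *bmp, int w, int h);        /* blit BMP to the LCD */

/* Gesture images preloaded into SDRAM, one 240x320 BMP per vocabulary word. */
extern const uint8_t *gesture_bmp[N_WORDS];

int main(void)
{
    static int16_t frame[FRAME_LEN];
    static double  mfcc[MAX_FRAMES * N_MFCC];

    while (1) {
        int n_frames = 0;

        /* 1. Acquire speech and extract features until the utterance ends. */
        while (n_frames < MAX_FRAMES &&
               audio_read_frame(frame, FRAME_LEN) == FRAME_LEN) {
            extract_mfcc(frame, FRAME_LEN, &mfcc[n_frames * N_MFCC]);
            n_frames++;
            if (utterance_ended(frame, FRAME_LEN))
                break;
        }

        /* 2. Match the feature sequence against the 20 word HMMs in SDRAM. */
        int word = match_utterance(mfcc, n_frames);

        /* 3. Display the corresponding "Chinese Sign Language" gesture image. */
        if (word >= 0 && word < N_WORDS)
            lcd_draw_bmp(gesture_bmp[word], IMG_W, IMG_H);
    }
    return 0;
}
```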
Keywords/Search Tags: Speech-to-gesture conversion, Speech recognition, Gesture display, FPGA, Hidden Markov Model