Research And Application Of Pronunciation Detection For Deaf Children Rehabilitation

Posted on: 2012-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Hao
Full Text: PDF
GTID: 2178330335450299
Subject: Computer application technology
Abstract/Summary:
Speech training is an important part of early childhood education. However, many children are delayed in early speech acquisition because of hearing impairment, and some lose the ability to speak altogether. Traditionally, speech rehabilitation for deaf children relies on special-education teachers who train the children through gestures, mouth shapes, and similar cues. This approach is difficult and inefficient, and there are too few teachers, which has made speech rehabilitation for deaf children a serious social problem. In recent years, with the development of computer multimedia technology and improvements in speech technology, computer-based speech training (CBST) systems have become increasingly popular and trusted. Nevertheless, systematic training systems designed specifically for Chinese are still very rare. Take the speech rehabilitation department of the Guangdong Disabled Rehabilitation Center as an example: although it has a variety of rehabilitation technologies and equipment, some of the software and equipment is not used to its full potential because the devices are complex and teachers are few.
This thesis analyzes the state of research on speech rehabilitation systems in China and abroad, and the advantages and disadvantages of the rehabilitation software currently in use. Based on users' actual needs, we design an interactive speech rehabilitation system for deaf children. The system uses a 3D virtual head to simulate the articulatory and tongue movements of Mandarin Chinese pronunciation and to show the dynamic pronunciation process. It also uses speech signal processing and automatic speech recognition to detect the pronunciation of deaf children and evaluate it automatically. The system comprises five modules: content visualization for speech rehabilitation, real-time auditory training and feedback, real-time pronunciation training and feedback, speech visualization with the 3D virtual head, and real-time display of the speaker's face.
Among these, the speech detection technology in the pronunciation training module is the research focus of this thesis.
Previous automatic detection techniques for Chinese pronunciation mostly take normal speech as their input, whereas our method applies to both deaf and normal speech. Taking into account the irregularity of deaf speech, weak background noise, and the real-time requirement, we study six tasks in depth: voice activity detection (VAD), duration calculation, volume computation, fundamental frequency extraction, voiced/unvoiced discrimination, and tone recognition. For VAD, the thesis selects a method based on adjacent-frame difference energy and mean energy. For duration calculation, it uses mean energy and zero-crossing rate (ZCR), which reflect endpoint information, as feature parameters. For volume computation, it uses logarithmic energy, which reflects human auditory characteristics and distinguishes loudness levels effectively. Fundamental frequency extraction consists mainly of NCFF calculation, peak extraction, and smoothing post-processing. Here the thesis proposes a five-point peak extraction method based on intra-frame local information and tries a new pitch contour smoothing process, which deals effectively with frequency-doubling points, frequency-halving points, and random outliers and gives a much better smoothing effect than other methods. For monosyllabic words, the thesis also proposes using the extracted fundamental frequency, in reverse, for voiced/unvoiced discrimination, and it improves tone recognition by using the frequency curve and sub-energy features.
The thesis builds two audio databases, one of normal adult speech and one of deaf children's speech, and uses them for statistical and comparative experiments between Matlab and Wavesurfer. The experiments show that the VAD and volume computation methods perform very well; the computed durations have a slight error that lies within the allowable range; and the accuracy and stability of fundamental frequency extraction match or even exceed those of Wavesurfer. On the normal audio database, the accuracy of voiced/unvoiced discrimination is 91.65% and the tone recognition rate is 88.60%; on the deaf children audio database, the accuracy of voiced/unvoiced discrimination is 84.24% and the tone recognition rate is 73.52%. In addition, the thesis briefly describes the design and development of the interactive language rehabilitation system, and its trial use and evaluation at the Shenzhen Hi-Tech Fair.
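To make the frame-level features named in the abstract more concrete, the following is a minimal Python sketch of mean energy, zero-crossing rate, logarithmic energy, and an energy-based endpoint estimate from which a duration can be read. It is not the thesis's implementation: the function names, the 20 ms frame and 10 ms hop, and the relative-energy threshold `energy_ratio` are all assumptions chosen for illustration, and the adjacent-frame difference energy used by the thesis's VAD is deliberately replaced here by a plain threshold on mean energy.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def mean_energy(frame):
    """Mean of the squared sample values in one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def log_energy_db(frame, floor=1e-10):
    """Mean energy on a decibel scale; `floor` avoids log(0) on silent frames."""
    return 10.0 * math.log10(max(mean_energy(frame), floor))


def detect_speech(samples, sample_rate, frame_ms=20, hop_ms=10, energy_ratio=0.1):
    """Return (start_s, end_s) spanning the first to last active frame, or None.

    A frame counts as active when its mean energy exceeds `energy_ratio`
    times the largest frame energy (an illustrative rule, not the thesis's).
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    hop = max(1, int(sample_rate * hop_ms / 1000))
    energies = [mean_energy(f) for f in frame_signal(samples, frame_len, hop)]
    if not energies:
        return None
    threshold = energy_ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    start = active[0] * hop / sample_rate
    end = (active[-1] * hop + frame_len) / sample_rate
    return start, end
```

Given a mono recording as a list of floats, `detect_speech(samples, sample_rate)` returns the estimated start and end times in seconds, and their difference is the utterance duration. The endpoint estimate is only accurate to about one hop, which is one reason a practical detector would combine energy with ZCR, as the abstract describes.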
Keywords/Search Tags: Speech Rehabilitation System, Speech Training, 3D Virtual Talking Head, Speech Visualization, Pronunciation Detection, Speech Signal Processing, Speech Recognition