Research On Chinese Tongue Ultrasound Video Conversion Based On DTW And CNN

Posted on: 2020-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z Yan
GTID: 2518306518463554
Subject: Software engineering
Abstract/Summary:
Mandarin Chinese is the common language of China. According to statistics, as of 2018 more than 160 million foreigners around the world were learning Chinese and using it as a necessary skill in their life and work. By comparison, the number of professional Chinese teachers is growing far too slowly to meet this international demand. There are also a large number of people with speech impairments in the world, for whom even a simple everyday conversation is very difficult. For a large part of them, the cause is a tongue movement disorder or a similar condition that leads to irregular or abnormal pronunciation. How to better meet these needs with the help of computers is a meaningful and promising topic.

In response to these problems, this thesis proposes a tongue motion synthesis method based on tongue ultrasound. Using an established tongue ultrasound database, the method visualizes the movement of the tongue inside the mouth while the user pronounces Chinese. The input is either pronunciation recorded live through a microphone or a pre-recorded audio file. It passes through a series of steps: speech preprocessing, semantic recognition, segment cutting, semantic segment matching and alignment, smooth ultrasound image generation, and ultrasound video synthesis. The final output is a coherent ultrasound video that visualizes the tongue motion.

The main work of this thesis includes designing the corpus, acquiring tongue ultrasound image data of Chinese pronunciation, and establishing the mapping between tongue movement and speech features for Chinese characters through semi-automatic cutting and calibration of the speech data and labeling of the ultrasound images. Based on the characteristics of the ultrasound images, the dynamic time warping (DTW) method is used to generate the ultrasound images, and a convolutional neural network (CNN) is applied to smooth the tongue motion ultrasound video. The work provides feasible solutions and ideas for a video synthesis system for Chinese tongue motion ultrasound images. Such a system can assist users in learning Chinese pronunciation and correcting tongue movement, and can also support subsequent research, such as providing tongue movement guidance for the Chinese pronunciation of synthetic virtual talking heads and humanoid robots. It can also supply real-time tongue movement simulation data when anthropomorphic robots speak, improving the interaction and realism of the robots.
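The abstract names dynamic time warping but does not give the details of how it is applied. As a rough illustration only, the sketch below shows textbook DTW aligning a query feature sequence with a reference sequence. The function name `dtw_align`, the Euclidean frame distance, and the toy one-dimensional features are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np


def dtw_align(query, reference):
    """Textbook DTW between two sequences of feature vectors.

    Hypothetical helper for illustration; not the thesis implementation.
    Returns (total_cost, path), where path is a list of
    (query_index, reference_index) pairs from start to end.
    """
    query = np.asarray(query, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Treat 1-D input as a sequence of scalar features.
    if query.ndim == 1:
        query = query[:, None]
    if reference.ndim == 1:
        reference = reference[:, None]

    n, m = len(query), len(reference)
    # Pairwise Euclidean distance between every query and reference frame
    # (an assumed local distance; other choices are possible).
    dist = np.linalg.norm(query[:, None, :] - reference[None, :, :], axis=2)

    # cost[i, j] = minimal accumulated cost of aligning the first i query
    # frames with the first j reference frames.
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1],  # match both frames
                cost[i - 1, j],      # advance the query only
                cost[i, j - 1],      # advance the reference only
            )

    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return float(cost[n, m]), path


# Toy example: the reference holds its middle value for one extra frame.
total, path = dtw_align([0, 1, 2], [0, 1, 1, 2])
print(total, path)
```

In a setting like the one described, each pair in the returned path could indicate which stored reference frame corresponds to which frame of the user's speech, though the features and matching scheme actually used in the thesis are not specified in this abstract.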
Keywords/Search Tags: Ultrasound image database, Deep learning, Chinese, Ultrasonic processing, Video synthesis