
Research On Models For Determining The Pronunciation Of Shujuan Zi Incorporating The Glyph And Semantic Information Of Chinese Characters

Posted on: 2020-10-27 | Degree: Master | Type: Thesis
Country: China | Candidate: Z J Meng | Full Text: PDF
GTID: 2415330575464611 | Subject: Computer technology
Abstract/Summary:
Shujuan Zi (书卷字) refers to characters that are not pronounced in the spoken language of modern Mandarin Chinese. Most of them have been assigned a contemporary pronunciation. At present, the pronunciation of a Shujuan Zi is determined manually. The factors considered are relatively simple, and the result is susceptible to the subjective judgment of the person making the determination. In addition, our research finds that discrepancies in contemporary pronunciations are far more numerous than existing research institutes have estimated, so relying on manual determination involves a huge workload. For these reasons, we use ancient rhyme data, incorporate the glyph and semantic information of Chinese characters, and apply statistical methods and artificial intelligence technology to study the automatic determination of the contemporary pronunciation of Shujuan Zi, which can overcome the limitations of manual determination to some extent.

The main tasks of our research are a basic candidate-pronunciation model for Shujuan Zi based on phonetic transcription, and models that incorporate the glyph and semantic information of Chinese characters. The model based on phonetic transcription determines the pronunciation mainly through the Fanqie (反切) and variant-character information. First, we collect the Fanqie and variant-character information from the ancient rhyme data. We then construct rules for establishing the contemporary pronunciation from these data and, at the same time, calculate the probability of each rule by statistical methods. Finally, we infer the contemporary pronunciation from the resulting probability tables.

Pictophonetic (shape-sound) characters make up the majority of Chinese characters, so the phonetic component is an important reference for studying a character's pronunciation. We therefore introduce the radical sequence information of Chinese characters and construct a neural network architecture with an attention mechanism to determine the contemporary pronunciation automatically. We train the model on Chinese characters whose pronunciation is already established, and then use it to predict the pronunciation of a test character and estimate its probability. In addition, we propose an optimization model that combines information from multiple sources. It calculates accuracy by an automatic learning method and then infers the contemporary pronunciation from a variety of factors. This approach can, to some extent, overcome the errors caused by scarce clues and subjective factors when the contemporary pronunciation is deduced manually.

For Shujuan Zi for which the Fanqie, variant characters, and radical sequence cannot be obtained, we propose a candidate-pronunciation model based on the visual features of Chinese characters. The method mainly uses image feature information of the character to determine the pronunciation. We find similar characters by calculating the similarity between image feature vectors, and take the pronunciations of the similar characters, together with their variant-character information, as candidate pronunciations. The experimental results show that the method can improve the recall rate.
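The statistical step described in the abstract (build rules from Fanqie data, estimate rule probabilities, infer the pronunciation from probability tables) might be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's actual model: it assumes each Fanqie is split into an upper speller (indicating the initial) and a lower speller (indicating the final and tone), that the two are treated as independent, and that probabilities are plain relative frequencies. The function names, the placeholder spellers `U1`, `U2`, `L1`, `L2`, and the toy training rows are all hypothetical.

```python
from collections import Counter, defaultdict


def normalise(counts):
    """Turn nested counts into relative-frequency tables."""
    tables = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        tables[key] = {value: n / total for value, n in counter.items()}
    return tables


def build_tables(training):
    """Estimate P(initial | upper speller) and P(final | lower speller).

    `training` holds (upper, lower, initial, final) tuples taken from
    characters whose contemporary pronunciation is already established.
    """
    initial_counts = defaultdict(Counter)
    final_counts = defaultdict(Counter)
    for upper, lower, initial, final in training:
        initial_counts[upper][initial] += 1
        final_counts[lower][final] += 1
    return normalise(initial_counts), normalise(final_counts)


def rank_candidates(upper, lower, initial_table, final_table):
    """Score every initial+final combination as a product of probabilities."""
    scores = {}
    for initial, p_initial in initial_table.get(upper, {}).items():
        for final, p_final in final_table.get(lower, {}).items():
            scores[initial + final] = p_initial * p_final
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data with placeholder spellers (hypothetical, not from the thesis).
training = [
    ("U1", "L1", "d", "ong1"),
    ("U1", "L2", "d", "an1"),
    ("U1", "L1", "t", "ong1"),
    ("U2", "L1", "g", "ong1"),
    ("U2", "L1", "g", "ong2"),
]
initial_table, final_table = build_tables(training)
ranked = rank_candidates("U1", "L1", initial_table, final_table)
best_reading, best_probability = ranked[0]
```

A character whose Fanqie spellers never occur in the training data yields an empty candidate list here, which is the kind of case the abstract handles with other sources of evidence.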
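The image-based fallback (find visually similar characters and borrow their pronunciations as candidates) can be illustrated with a small nearest-neighbour sketch. The abstract does not say how the image features are extracted or which similarity measure is used, so cosine similarity, the three-dimensional vectors, the placeholder characters `A`, `B`, `C`, and the function names below are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def candidate_pronunciations(query_vector, lexicon, top_k=3):
    """Return the top_k most similar characters as (char, reading, similarity).

    `lexicon` maps a character to (feature_vector, known_pronunciation).
    """
    scored = sorted(
        (
            (cosine_similarity(query_vector, vector), char, reading)
            for char, (vector, reading) in lexicon.items()
        ),
        reverse=True,
    )
    return [(char, reading, sim) for sim, char, reading in scored[:top_k]]


# Toy lexicon with placeholder characters and made-up feature vectors.
lexicon = {
    "A": ([1.0, 0.0, 0.0], "ma1"),
    "B": ([0.9, 0.1, 0.0], "ma3"),
    "C": ([0.0, 1.0, 0.0], "li4"),
}
candidates = candidate_pronunciations([1.0, 0.0, 0.0], lexicon, top_k=2)
```

Returning several candidates rather than a single best match fits the abstract's emphasis on recall: the correct reading only needs to appear somewhere in the candidate list.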
Keywords/Search Tags:Determining the pronunciation of Shujuan Zi, The glyph and semantic information of Chinese characters, Statistical methodology, Deep learning