
Research On Chinese-Braille Translation Of Word Segmentation And Link Writing

Posted on: 2019-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: R Zhang
Full Text: PDF
GTID: 2428330566964612
Subject: Engineering, Electronics and Communication Engineering
Abstract/Summary:
Chinese braille is based on Chinese Pinyin and is written in braille characters. Chinese text has no explicit boundaries between words, whereas in Chinese braille words are separated by a blank. Translating Chinese text into braille consists of two main steps: first, the Chinese text undergoes word segmentation and link writing and is then converted into Pinyin; second, the Pinyin is translated into braille according to the correspondence between the two. Word segmentation and link writing is therefore an important rule of Chinese braille.

Traditional methods for Chinese braille word segmentation and link writing depend mainly on complex feature engineering, and the features have to be extracted from the Chinese text by hand. Deep learning methods allow the computer to learn features at multiple levels of abstraction automatically, without depending entirely on hand-crafted features. In particular, the appearance of word vectors has made deep neural networks easier to train. In this thesis, a corpus for Chinese braille word segmentation and link writing is constructed from an existing standard Chinese segmentation data set, following the braille rules for segmentation and link writing. The author adopts the BI-LSTM-CRF neural network model as the basic framework for the task.

Most of the braille rules are based on part-of-speech features, yet the existing neural network model considers only the lexeme information in the data, so the model cannot fully learn the characteristics of the data. In this thesis, related syntactic features are added to the distributed word vectors, which substantially increases the similarity between words of the same part of speech. The added syntactic features also optimize the input of the neural network, which strengthens the model. The CRF layer not only makes better use of sentence-level tag information but also takes into account the dependencies between each output tag and the tags before and after it. With the transition matrix obtained during training, the Viterbi algorithm is used for decoding at prediction time, which avoids invalid label combinations.

The experimental results show that the improved BI-LSTM-CRF neural network model performs well on Chinese braille word segmentation and link writing. Finally, a Web system for Chinese braille word segmentation and link writing is built on the improved model. Users can access the system through a browser and convert Chinese text into text that complies with the Chinese braille word segmentation and link writing rules.
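
The decoding step described in the abstract can be illustrated with a minimal sketch of Viterbi decoding over per-character tag scores. Everything concrete here is an assumption made for illustration, not taken from the thesis: the B/M/E/S tag set (begin, middle, end, single), the hand-filled `TRANSITIONS` matrix (a real CRF layer learns these scores during training), the toy `EMISSIONS` scores standing in for BI-LSTM outputs, and the function names. Start and end constraints are omitted for brevity.

```python
TAGS = ["B", "M", "E", "S"]  # assumed tag set, not specified in the abstract
NEG = -1e9                   # stand-in score for an invalid transition

# TRANSITIONS[i][j]: score of moving from TAGS[i] to TAGS[j].
# Hand-filled for illustration; a trained CRF would learn these values.
TRANSITIONS = [
    # to B  to M  to E  to S
    [NEG, 0.0, 0.0, NEG],  # from B: a word that began must continue or end
    [NEG, 0.0, 0.0, NEG],  # from M: same as B
    [0.0, NEG, NEG, 0.0],  # from E: a new word must start
    [0.0, NEG, NEG, 0.0],  # from S: a new word must start
]

# Toy per-character tag scores (one row per character, one column per tag).
EMISSIONS = [
    [2.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.5, 1.0],
    [0.0, 0.0, 1.0, 2.0],
]


def greedy(emissions):
    """Pick the best tag per position, ignoring transitions."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in emissions]


def viterbi(emissions, transitions):
    """Return the highest-scoring tag index sequence under the transitions."""
    n_tags = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_tags):
            # Best previous tag for reaching tag j at this position.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


print([TAGS[i] for i in greedy(EMISSIONS)])                # ['B', 'S', 'S']
print([TAGS[i] for i in viterbi(EMISSIONS, TRANSITIONS)])  # ['B', 'E', 'S']
```

In this toy input, picking the best tag per character yields B followed by S, which is not a valid combination because a word that has begun is never closed. Decoding with the transition scores returns B, E, S instead, which is the kind of invalid label combination the abstract says the CRF layer and Viterbi decoding avoid.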
Keywords/Search Tags: Word Segmentation and Link Writing, BI-LSTM-CRF, Deep Learning, Word2vec, Conditional Random Field