
Chinese Word Segmentation Analysis Based On Bidirectional LSTMN Recurrent Neural Network

Posted on: 2017-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Huang
GTID: 2308330485463631
Subject: Software engineering
Abstract/Summary:
Before 2002, algorithmic approaches to Chinese word segmentation relied essentially on dictionary and thesaurus matching. In 2002, the first paper on segmentation by character tagging was published, and Chinese word segmentation was for the first time cast as a sequence-to-sequence problem, in which an output sequence is generated from an input sequence. Segmentation systems built on character tagging subsequently achieved good results with models such as maximum entropy models (MEM), hidden Markov models (HMM), conditional random fields (CRF), and support vector machines (SVM). Current mainstream word segmentation systems use the conditional random field model.

The concept of deep learning was proposed in 2006 and was subsequently applied to computer vision, natural language processing, and speech recognition, where it produced many breakthroughs. Recurrent neural networks in particular are widely used for part-of-speech tagging, translation, named entity recognition, and similar tasks. Casting natural language processing problems as sequence-to-sequence problems and handling them with a suitably structured recurrent neural network has become a popular, mainstream approach.

Word segmentation by character tagging is essentially such a sequence-to-sequence problem. In this thesis, we propose an improved bidirectional long short-term memory (LSTM) neural network for Chinese word segmentation. The improved LSTM unit differs from the standard unit in that a memory tape is embedded within the unit to store past information, which is then used selectively through an attention mechanism. This avoids the information compression caused by passing everything through a single hidden state. A standard LSTM network handles long-range dependencies between words well, and a bidirectional LSTM network captures the context on both sides of a word in the sentence, so this network structure can better model the meaning of the sentence and segment it correctly. We also propose a standard bidirectional LSTM network augmented with an attention mechanism for Chinese word segmentation, and study how adding the attention mechanism at different positions affects segmentation performance.
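To make the tagging formulation concrete, the following is a minimal sketch of how per-character labels map to and from a segmentation. It assumes the common four-tag scheme B (begin), M (middle), E (end), S (single-character word); the abstract does not state which tag set the thesis uses, and the function names `words_to_tags` and `tags_to_words` are hypothetical. In a full system the tags would be predicted by the tagging model (for example, the bidirectional LSTM); here they are supplied directly.

```python
def words_to_tags(words):
    """Convert a segmented sentence into one tag per character.

    Assumed tag set (not confirmed by the abstract): B, M, E, S.
    """
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first character, zero or more middle characters, last character
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their predicted tags."""
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words = []
    current = ""
    for ch, tag in zip(chars, tags):
        # A new word starts at B or S; flush any unfinished word first,
        # which tolerates malformed tag sequences from a model.
        if tag in ("B", "S") and current:
            words.append(current)
            current = ""
        current += ch
        # A word ends at E or S.
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # trailing characters without a closing E tag
        words.append(current)
    return words
```

Under this scheme, segmentation reduces to predicting one tag per input character, which is why the output sequence always has the same length as the input sentence.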
Keywords/Search Tags: Deep learning, long short-term memory neural network, Chinese word segmentation, attention mechanism