
Research On Beam Search And Neural Network For Chunking

Posted on: 2017-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: C Cheng
Full Text: PDF
GTID: 2308330485462280
Subject: Computer technology
Abstract/Summary:
Chunking is one of the fundamental tasks in natural language processing. It aims to divide a text into syntactically correlated, structurally simpler, non-overlapping, non-recursive groups of words. Since computers were first applied to NLP tasks, chunking has been a research problem spanning both linguistics and computer science. Because it can serve as a preprocessing stage for many tasks such as machine translation, parsing, information retrieval, and information extraction, chunking has attracted a great deal of research driven by these broad application requirements. Despite much related work, completely solving the chunking problem remains a long-term challenge.

Rule-based approaches were the first to be applied to the chunking task, while statistical methods are the dominant ones at present. Traditional methods view chunking as a sequence labeling problem, which can be solved with a structured prediction model such as a conditional random field (CRF). However, the Markov assumption of this kind of model can hurt tagging accuracy, because it prevents the use of many useful features when words are tagged locally. How to alleviate this limitation is still an open question, although much work has been done on it.

Taking this limitation into consideration, we use a transition-based model to solve the chunking task. In addition, we apply a neural network, which can learn the non-linear relationship between input and output, to score every valid transition operation when a transition is made between two states. Specifically, the main contributions of this thesis are as follows.

First, we build a strong baseline system that combines the transition-based method with a simple feedforward neural network. It trains the model and decodes the input sentence with a greedy strategy.
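The greedy transition-based baseline can be illustrated with a minimal sketch. The scorer below is a hand-written stub standing in for the feedforward network, and the toy tag set, the `VALID_NEXT` table, and all names are illustrative, not the thesis's actual model:

```python
# Toy greedy transition-based chunking with BIO-style tags.
# At each word, only actions valid after the previous tag are considered,
# and the highest-scoring valid action is taken immediately (no lookahead).

VALID_NEXT = {
    None:   ("B-NP", "O"),          # allowed first actions
    "B-NP": ("B-NP", "I-NP", "O"),
    "I-NP": ("B-NP", "I-NP", "O"),
    "O":    ("B-NP", "O"),          # I-NP may not follow O
}

def stub_score(word, prev_tag, action):
    """Stand-in for a neural scorer: prefers NP tags on a toy noun list."""
    nouns = {"dog", "cat", "mat"}
    if word in nouns:
        if action == "I-NP" and prev_tag in ("B-NP", "I-NP"):
            return 2.0
        if action == "B-NP":
            return 1.5
    return 1.0 if action == "O" else 0.0

def greedy_chunk(words, scorer=stub_score):
    """Tag left to right, taking the best-scoring valid action at each step."""
    tags, prev = [], None
    for w in words:
        best = max(VALID_NEXT[prev], key=lambda a: scorer(w, prev, a))
        tags.append(best)
        prev = best
    return tags
```

Because each decision is local and final, an early mistake cannot be revised later, which is the weakness the beam search extension addresses.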
Second, to model the complete labeling sequence of a sentence more globally, we adopt a beam search strategy instead of the greedy strategy to decode the input sentence, and we use contrastive divergence learning to train the model. The experimental results show a large improvement over the baseline system once the labeling sequence is modeled globally with the beam search strategy.

Finally, because the simple feedforward neural network cannot model long-distance information well, neither the tags already predicted nor distant words, the sequence score composed of the transition actions' scores may be poor. To obtain a better sequence score by exploiting more long-distance information, we replace the feedforward network with a more powerful one, the long short-term memory (LSTM) network, within the beam search framework to score a complete labeling sequence. The performance improvement on two tasks in our experiments shows the effectiveness of this method.

Performance improves steadily as we enhance the model from the baseline. The final method, which combines the beam search strategy with the long short-term memory network, achieves the best reported performance on three tasks, and performance comparable to the best on the Chinese text chunking task.
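The beam search decoding described above can be sketched as follows. At each word, every hypothesis on the beam is expanded with all valid actions, and only the `beam_size` partial sequences with the highest cumulative score survive. As before, the scorer is a hand-written stub standing in for the network, and all names and the toy tag set are illustrative:

```python
# Toy beam search decoding over the same BIO-style transition system.
import heapq

VALID_NEXT = {
    None:   ("B-NP", "O"),
    "B-NP": ("B-NP", "I-NP", "O"),
    "I-NP": ("B-NP", "I-NP", "O"),
    "O":    ("B-NP", "O"),
}

def stub_score(word, prev_tag, action):
    """Stand-in for a neural scorer: prefers NP tags on a toy noun list."""
    nouns = {"dog", "cat", "mat"}
    if word in nouns:
        if action == "I-NP" and prev_tag in ("B-NP", "I-NP"):
            return 2.0
        if action == "B-NP":
            return 1.5
    return 1.0 if action == "O" else 0.0

def beam_chunk(words, scorer=stub_score, beam_size=2):
    """Keep the beam_size best partial tag sequences by summed action score."""
    beam = [(0.0, [])]  # (cumulative score, tags so far)
    for w in words:
        candidates = []
        for score, tags in beam:
            prev = tags[-1] if tags else None
            for action in VALID_NEXT[prev]:
                candidates.append((score + scorer(w, prev, action),
                                   tags + [action]))
        beam = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
    return max(beam, key=lambda c: c[0])[1]
```

With `beam_size=1` this reduces to the greedy baseline; a wider beam scores whole candidate sequences rather than isolated decisions, which is what lets the global model recover from locally attractive but globally poor actions.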
Keywords/Search Tags: chunking, sequence labeling, transition-based, beam search, long short-term memory network