
Research on Sequence Labeling Models in Natural Language Processing

Posted on: 2013-05-05; Degree: Doctor; Type: Dissertation
Country: China; Candidate: F Ji; Full Text: PDF
GTID: 1228330395951180; Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet in recent years, more and more people hope that computers can understand natural language and help them work more efficiently. As a result, natural language processing has become one of the hottest research topics.

In natural language processing, sequence labeling models are very popular and are widely applied to problems such as chunking and part-of-speech tagging. Unlike traditional classification problems, the output of a sequence labeling model is a label sequence that carries structured information. In general, the labels in a sequence are interrelated. By exploiting this structured information, a sequence labeling model can achieve better performance than traditional classification methods.

In this dissertation, we focus on solving complex sequence labeling problems in natural language processing, and we improve the labeling models in two main respects.

First, we propose a coupled sequence labeling model for complex labeling problems that can be decomposed into two fundamental labeling problems. The model contains two interacting Markov chains. To find the two optimal label sequences of this model simultaneously, we also propose an exact decoding algorithm. Depending on the requirements of the application, these two chains can be combined to build an actual labeling model. To accommodate different decoding algorithms, we also propose a new parameter learning algorithm that leverages heterogeneous corpora. Experimental results on different corpora show that our model outperforms the other methods on various labeling tasks.

Second, we propose an exact decoding algorithm for Viterbi decoding problems of any order. By extending the concept of a state in the decoding process, we uniformly transform the high-order label tagging problem into a first-order state tagging problem. Based on the constraints on transitions between states, we encode each state with a unique identifier. Using this identifier, we can quickly find the valid previous states of a state and thereby prune the state search space. Experimental results on different corpora demonstrate that, with this unified algorithm, performance can be improved by increasing the order without any change to the algorithm implementation.
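
The reduction from high-order label tagging to first-order state tagging can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names, the score(t, history, label) signature, and the base-|labels| integer encoding of states are all assumptions made for illustration. A state is taken to be the tuple of the last `order` labels, so a state can only follow a state whose trailing labels match its leading labels, and that constraint is what the identifier arithmetic exploits.

```python
def high_order_viterbi(n, labels, order, score):
    """Exact decoding of a length-n sequence for a model of the given order.

    score(t, history, label) is a hypothetical scoring function; history is
    the tuple of up to `order` labels that precede position t.
    Each extended state is the tuple of the last `order` labels, which turns
    the high-order problem into a first-order problem over states.
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    best = {(): (0.0, [])}  # state -> (best score, best label path)
    for t in range(n):
        new_best = {}
        for state, (s, path) in best.items():
            for y in labels:
                cand = s + score(t, state, y)
                # The successor state keeps only the most recent `order` labels.
                new_state = (state + (y,))[-order:]
                if new_state not in new_best or cand > new_best[new_state][0]:
                    new_best[new_state] = (cand, path + [y])
        best = new_best
    return max(best.values(), key=lambda v: v[0])[1]


def encode_state(state, num_labels):
    """Assumed encoding: read the label tuple as a base-num_labels integer."""
    state_id = 0
    for y in state:
        state_id = state_id * num_labels + y
    return state_id


def predecessor_ids(state_id, num_labels, order):
    """Identifiers of the only states that may precede state_id.

    A predecessor (y0, y1, ..., y_{k-1}) shares its last k-1 labels with the
    first k-1 labels of the current state (y1, ..., y_k), so its identifier
    is y0 * num_labels**(k-1) + state_id // num_labels.
    """
    high = num_labels ** (order - 1)
    return [y0 * high + state_id // num_labels for y0 in range(num_labels)]
```

With this encoding, each state has only len(labels) candidate predecessors rather than every state in the extended space, which is one way to realize the pruning described above. Raising `order` changes only the size of the state space, not the decoding loop.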
Keywords/Search Tags: Structured learning, Sequence Labeling, Coupled Sequences Labeling, High-Order Viterbi Algorithm