
Research On Joint Learning Of Sequence Labeling In Natural Language Processing

Posted on: 2015-03-14    Degree: Doctor    Type: Dissertation
Country: China    Candidate: X X Li    Full Text: PDF
GTID: 1268330422992443    Subject: Computer application technology
Abstract/Summary:
Sequence labeling is one of the fundamental problems in natural language processing. In this thesis, we classify it into two categories: the single sequence labeling problem (SSLP), which predicts one output label sequence, and multiple sequence labeling problems (MSLPs), which predict several output label sequences. The cascaded approach treats MSLPs as multiple SSLPs and processes them in a pipeline. However, this approach suffers from error propagation and a lack of information sharing among the SSLPs. A joint learning approach can overcome these drawbacks by jointly processing multiple SSLPs in one model or one framework; it effectively enhances information exchange among the SSLPs and improves their prediction performance. This thesis discusses different types of sequence labeling problems and studies both single sequence labeling approaches and joint learning approaches. Our main research topics include:

1. Traditional sequence labeling approaches use the neighboring information of an input unit as features and usually lack global information, which tends to produce incorrect annotations. A cascaded reranking approach with global information fusion is proposed to solve this problem. For the SSLP, the cascaded reranking approach brings in several models that carry sequence-level global information and syntactic information. First, a linear reranking approach is used to combine these models. Second, a structured perceptron reranking approach uses features extracted from these models to build the reranking model. Finally, the linear reranking approach and the structured perceptron reranking approach are cascaded to choose the optimal output label sequence. For MSLPs, the cascaded reranking joint learning approach can employ the global information within each SSLP as well as combination information among the SSLPs. Experimental results show that the cascaded reranking approach improves recognition accuracy on Chinese pinyin-to-character conversion and Mandarin speech recognition by incorporating part-of-speech and syntactic information, and that the cascaded reranking joint learning approach outperforms the cascaded approach and the tag combination approach on English part-of-speech tagging and chunking.

2. Compared with a single learning approach, a joint decoding approach can integrate different models in the decoding phase and improve prediction performance. The thesis proposes supervised and semi-supervised joint decoding approaches for MSLPs. The supervised joint decoding approach integrates different models with linear weights in the decoding phase (a minimal sketch of this weighted combination follows below), and the semi-supervised joint decoding approach selects text annotated identically by two models as new training sentences. The joint decoding approaches are then applied to Chinese word segmentation and part-of-speech tagging. Experimental results show that the supervised joint decoding approach outperforms other single supervised approaches, and that the semi-supervised joint decoding approach outperforms state-of-the-art supervised and semi-supervised approaches.
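Both the linear reranking step in topic 1 and the supervised joint decoding in topic 2 rest on the same basic idea: each candidate label sequence is scored by several component models, and the weighted sum of those scores selects the winner. The following Python sketch illustrates that combination in its simplest form; the function names, the n-best candidate interface, and the weights are assumptions for illustration only, not the thesis implementation.

    # Minimal sketch of linear score combination over an n-best candidate list.
    # All names and the scoring interface are illustrative assumptions.

    def combine_and_select(candidates, models, weights):
        """Pick the candidate label sequence with the highest weighted score.

        candidates: list of label sequences (e.g. lists of tags)
        models:     list of scoring functions, each mapping a sequence to a float
        weights:    one linear weight per model
        """
        best_seq, best_score = None, float("-inf")
        for seq in candidates:
            score = sum(w * model(seq) for model, w in zip(models, weights))
            if score > best_score:
                best_seq, best_score = seq, score
        return best_seq

    # Toy usage: two "models" scoring three candidate tag sequences.
    if __name__ == "__main__":
        cands = [["B", "E"], ["S", "S"], ["B", "M"]]
        models = [lambda s: -s.count("M"),                 # penalize dangling M tags
                  lambda s: 1.0 if s[0] == "B" else 0.0]   # prefer sequences starting a word
        print(combine_and_select(cands, models, weights=[1.0, 0.5]))

In a reranking setting the candidates would be the n-best outputs of a base tagger, and the weights would be tuned on development data rather than fixed by hand.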
3. The cascaded reranking joint learning approach and the joint decoding approach cannot be applied to MSLPs with inconsistent training data. An iterative joint learning approach is proposed to solve this problem; it allows each SSLP in the MSLPs to share information with the other SSLPs through feature propagation. In each iteration, every problem uses a structured perceptron based ensemble method to combine the models built from basic information with the models built from information produced by the other problems. Experimental results on English part-of-speech tagging and chunking, Chinese word segmentation and part-of-speech tagging, and Chinese named entity recognition show that the iterative joint learning approach outperforms the pipeline approach, the tag combination approach and other ensemble methods.

4. Traditional approaches to Chinese sequence labeling problems use discrete linguistic information as features. However, this greatly increases the scale of the trained model, and the features for different problems need to be manually chosen and tuned on development data. To solve this problem, a deep neural network model with word-boundary-based character representation is proposed and applied to the Chinese single sequence labeling problem. In the character representation layer of the deep neural network model, each Chinese character is converted to a combination of four word-boundary-based character representations (a minimal sketch of this layer is given below). In the tag inference layer, the model uses a second-order tag transition matrix to strengthen tag constraints. A deep neural network based joint learning approach is then used for MSLPs to increase information exchange among multiple SSLPs by sharing their character representation layer. Experimental results on Chinese word segmentation, part-of-speech tagging and named entity recognition show that the deep neural network model with word-boundary-based character representation outperforms the model with the baseline character representation, and that the deep neural network based joint learning approach further improves the single models.
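As an illustration of the character representation layer described in topic 4, the sketch below keeps one embedding per word boundary tag (B, M, E, S) for each character and outputs a combination of the four views. The embedding size, the random initialization, and the averaging rule are assumptions made for illustration only; they are not the architecture or parameters used in the thesis.

    # Minimal sketch, under assumed dimensions, of a word-boundary-based
    # character representation: four boundary-tag views per character,
    # combined here by a simple average.
    import numpy as np

    BOUNDARY_TAGS = ["B", "M", "E", "S"]   # begin / middle / end of word, single-character word
    EMB_DIM = 50                           # assumed embedding size
    rng = np.random.default_rng(0)

    # One embedding table per boundary tag: {tag: {character: vector}}
    tables = {tag: {} for tag in BOUNDARY_TAGS}

    def char_representation(char):
        """Combine the four word-boundary views of a character into one vector."""
        views = []
        for tag in BOUNDARY_TAGS:
            if char not in tables[tag]:
                tables[tag][char] = rng.normal(scale=0.1, size=EMB_DIM)
            views.append(tables[tag][char])
        return np.mean(views, axis=0)      # illustrative combination; the thesis rule may differ

In a full model these embeddings would be learned jointly with the upper layers, and the resulting character vectors would feed the tag inference layer with its second-order transition matrix.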
Keywords/Search Tags: Sequence Labeling Problem, Joint Learning, Reranking Approach, Iterative Approach, Deep Neural Network