Transition-based Dependency Parser Combining With Self-training

Posted on: 2016-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: D W Xia
Full Text: PDF
GTID: 2298330467980968
Subject: Computer technology
Abstract/Summary:
Transition-based dependency parsing is a data-driven dependency parsing technique that relies on standard supervised machine learning, which makes a transition-based dependency parser dependent on labeled data. When labeled data is scarce, or when the domain of the training data does not match the domain of the test data, the parser's performance degrades. To address this problem, we combined self-training, a semi-supervised learning method, with a traditional transition-based parser. The resulting parser can make full use of unlabeled data and adapts better when the training and test domains differ.

First, we found that the classical transition-based dependency parser is greedy, so its parsing errors share some common characteristics. Based on these characteristics, we defined two kinds of root-biased sub-trees and proposed a new method that parses a sentence after a pre-parse step in which sub-trees are reduced. Our experiments showed that this approach reduces errors in root-biased sub-trees without affecting the parsing of the rest of the sentence, which improves the parser's performance.

Then we incorporated self-training into the transition-based dependency parser. The parser first parses unlabeled data using a model trained on the original training data. It then uses a self-confidence-based data selection method and a heterogeneity-based re-ranker data selection method to choose high-quality, heterogeneous sentences, which are added to the original training data. We call the resulting data set the self-training data set. We train a new model on it and use that model to parse the test data. Experiments showed that our method has stronger domain adaptation ability and performs better when training data is scarce.

Finally, we designed and built a transition-based parser that integrates the self-training method. The parser supports not only the standard supervised approach to training models and parsing sentences, but also the semi-supervised self-training method.
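The self-training procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The function names (`select_confident`, `self_train`), the `threshold` and `max_added` parameters, and the encoding of a parse as a list of head indices are all assumptions made for illustration. The sketch covers only a confidence-based selection step; the heterogeneity-based re-ranker is not specified in the abstract and is left out.

```python
from typing import Any, Callable, List, Sequence, Tuple

Sentence = List[str]
Parse = List[int]  # hypothetical encoding: head index per token, 0 = root
Example = Tuple[Sentence, Parse]


def select_confident(
    parsed: Sequence[Tuple[Sentence, Parse, float]],
    threshold: float,
    max_added: int,
) -> List[Example]:
    """Keep auto-parsed sentences with confidence >= threshold.

    Results are ordered from most to least confident and truncated
    to at most `max_added` items.
    """
    kept = [item for item in parsed if item[2] >= threshold]
    kept.sort(key=lambda item: item[2], reverse=True)
    return [(sent, heads) for sent, heads, _ in kept[:max_added]]


def self_train(
    labeled: Sequence[Example],
    unlabeled: Sequence[Sentence],
    train: Callable[[List[Example]], Any],
    parse_with_confidence: Callable[[Any, Sentence], Tuple[Parse, float]],
    threshold: float = 0.9,   # assumed value, not taken from the thesis
    max_added: int = 1000,    # assumed value, not taken from the thesis
) -> Any:
    """One round of self-training around a caller-supplied parser."""
    # 1. Train a base model on the original labeled data.
    base_model = train(list(labeled))

    # 2. Parse the unlabeled sentences and record a confidence score.
    parsed = []
    for sent in unlabeled:
        heads, confidence = parse_with_confidence(base_model, sent)
        parsed.append((sent, heads, confidence))

    # 3. Select confident parses and add them to the training data.
    selected = select_confident(parsed, threshold, max_added)

    # 4. Retrain on the enlarged "self-training data set".
    return train(list(labeled) + selected)
```

The `train` and `parse_with_confidence` callables are passed in, so the sketch makes no claim about how the underlying transition-based parser or its confidence score is computed.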
Keywords/Search Tags: Self-Training, Transition-Based Dependency Parsing, Certainty, Dependency Parsing, Semi-Supervised Learning