Semi-supervised Structured Learning For Pos-tag Projection Across Languages

Posted on:2013-08-12

Degree:Master

Type:Thesis

Country:China

Candidate:P L Hu

GTID:2268330392467974

Subject:Computer Science and Technology

Abstract/Summary:

Natural language processing (NLP) has achieved great success in the information age and has become part of everyday life. Cultural exchange creates demand for NLP in minority languages, but these languages lack labeled corpora, which limits the development of NLP techniques for them. Cross-lingual projection methods address this by using resource-rich languages to help learning in resource-poor languages.

In this thesis, we apply several semi-supervised structured learning algorithms that use word alignment to help POS-tag projection. We define cross-lingual projection as a semi-supervised structured learning problem, and all the proposed methods are incorporated into this framework. We first propose a direct projection algorithm, which projects the POS tags of the source language directly onto the target language via word alignment. We then consider algorithms for two settings: no labeled target-language data, and only a small amount of labeled target-language data. We also study word alignment filtering; using two alignment filtering methods improves cross-lingual projection accuracy.

We further apply the co-training framework to cross-lingual projection, extending co-training to structured learning, and investigate confidence metrics for the sequence labeling model and the influence of different alignment types. The experiments show that one-to-one word alignment combined with a piece-based training-data update strategy gives better results. Finally, we use the label propagation algorithm to reduce the noise introduced by direct projection. The similarity graph is built from the context features of words, and singular value decomposition is used for feature reduction to lessen the impact of feature sparsity. The POS-tag distributions inferred by label propagation are then used to constrain the Markov random fields. The experiments show that co-training and label propagation succeed on the POS-tag projection task, outperforming both direct projection and supervised methods trained on small amounts of labeled data.
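
As a rough illustration of direct projection through word alignment, the following minimal Python sketch copies source-language tags onto target tokens. It is not the thesis implementation: the function names, the `(source_index, target_index)` alignment format, and the choice of a one-to-one filter as the alignment filtering step are all assumptions made for this example.

```python
from collections import Counter


def one_to_one(alignment):
    """Keep only links whose source index and target index each occur once.

    `alignment` is a list of (source_index, target_index) pairs
    (a hypothetical format chosen for this sketch).
    """
    src_counts = Counter(s for s, _ in alignment)
    tgt_counts = Counter(t for _, t in alignment)
    return [(s, t) for s, t in alignment
            if src_counts[s] == 1 and tgt_counts[t] == 1]


def project_tags(source_tags, target_length, alignment, unknown=None):
    """Project source POS tags onto target positions via filtered links.

    Target tokens without a surviving link keep the `unknown` marker,
    so a later semi-supervised step could fill them in.
    """
    projected = [unknown] * target_length
    for s, t in one_to_one(alignment):
        projected[t] = source_tags[s]
    return projected


if __name__ == "__main__":
    source_tags = ["DET", "NOUN", "VERB"]
    # Source token 2 aligns to two target tokens, so both links are dropped.
    alignment = [(0, 0), (1, 1), (2, 2), (2, 3)]
    print(project_tags(source_tags, 4, alignment))
```

In this sketch, target tokens left untagged after filtering are the positions where noisy or missing projections would need to be resolved by methods such as co-training or label propagation.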

Keywords/Search Tags:

pos-tagging, semi-supervised learning, cross-lingual projection, co-training, label propagation

Related items

1	Research On The Application Of Semi-supervised Learning In Natural Language Processing
2	Research And Application Of Image Classification Algorithm Based On Semi-supervised Learning
3	Research On Label Propagation Of Semi-supervised Based On Clustering
4	Learning From Limited And Imperfect Tagging
5	The Study Of Robust Semi-Supervised Classification Algorithm Based On Label Prediction And Propagation
6	Research On Semi-supervised Cross-modal Hashing Retrieval Algorithm
7	The Study Of Robust Label Propagation Algorithms For Semi-Supervised Data Classification
8	Research On Partially Labeled Problem Based On Active Learning And Semi-supervised Mechanism
9	Research On Semi-supervised Partial Label Learning Algorithm Based On Label Propagation
10	Research On Semi-supervised Label Distribution Learning And Label Enhancement Algorithm