Research On Cross-lingual Relation Extraction Between Named Entities

Posted on: 2015-03-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Hu
Full Text: PDF
GTID: 2268330428498559
Subject: Computer application technology
Abstract/Summary:
The quantity and quality of training corpora strongly affect the performance of machine learning-based semantic relation extraction between named entities, but annotating corpora is time-consuming and labor-intensive. Meanwhile, the emergence of multilingual corpora and advances in machine translation offer a good opportunity to study the redundancy and complementarity between languages. This thesis proposes three approaches to cross-lingual relation extraction, aiming to improve multilingual relation extraction performance while reducing the amount of labeled data required. The study covers three aspects:
1) Cross-lingual relation extraction via machine translation. We first obtain translated corpora via machine translation, then perform entity alignment, and finally add the mapped translated instances to the training corpus of the target language to aid its relation extraction.
2) Bilingual co-training for relation classification. Given a small number of labeled instances and a large number of unlabeled instances in two languages, the translations of instances that are reliably classified in one language are added to the training corpus of the other language in a bootstrapping fashion. The relation classifiers for the two languages thus improve each other simultaneously.
3) Bilingual active learning for relation classification. The active learning paradigm is applied to bilingual relation classification. When selecting the most informative unlabeled instances, a combined confidence score takes into account the confidence measures of an instance in both languages.
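The selection step of the bilingual active learning approach can be illustrated with a minimal sketch. The abstract does not say how the two per-language confidences are combined, so the geometric mean of each classifier's top class probability is used here purely as an assumption. The names `combined_confidence`, `select_for_annotation`, and the toy `pool` are hypothetical and are not taken from the thesis.

```python
from math import sqrt


def combined_confidence(probs_a, probs_b):
    """Combine the confidences of one instance in two languages.

    Assumption: confidence in each language is the top class probability,
    and the two are merged with a geometric mean. The thesis may use a
    different rule.
    """
    return sqrt(max(probs_a) * max(probs_b))


def select_for_annotation(pool, batch_size):
    """Return the ids of the batch_size least confident instances.

    pool maps an instance id to (probs_a, probs_b): the class probabilities
    predicted for the instance and for its translation by the two
    language-specific classifiers.
    """
    ranked = sorted(pool, key=lambda i: combined_confidence(*pool[i]))
    return ranked[:batch_size]


# Toy pool of three unlabeled instances with two relation classes.
pool = {
    "s1": ([0.90, 0.10], [0.80, 0.20]),  # confident in both languages
    "s2": ([0.55, 0.45], [0.60, 0.40]),  # uncertain in both languages
    "s3": ([0.50, 0.50], [0.95, 0.05]),  # uncertain in one language only
}
print(select_for_annotation(pool, 2))
```

With this rule, an instance that is uncertain in both languages ranks ahead of one that is uncertain in only one of them, which is one plausible reading of "most informative" in a bilingual setting.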
Relation extraction experiments on the ACE 2005 Chinese and English corpora show that both manually labeled and automatically classified relation instances in one language consistently and robustly help relation extraction in the other language. The benefit is particularly significant when the training corpus of the target language is small.
Keywords/Search Tags: Relation Extraction between Named Entities, Machine Translation, Entity Alignment, Co-training, Active Learning