
Relationship Extraction Technology Based On Co-Training And Kernel Method

Posted on: 2016-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Zhang
GTID: 2298330467992101
Subject: Signal and Information Processing
Abstract/Summary:
Relation extraction identifies entities in natural language text that stand in a given relationship, which lets people find information quickly and easily. It also extracts entity pairs from text automatically and, by rebuilding the results into a new data structure, strengthens people's ability to analyze information.

Relation extraction still has many areas for improvement. This thesis surveys current research, examines the open issues in relation extraction and related fields, and then proposes several ideas for improving semi-supervised relation extraction and verifies them experimentally. The main results are as follows.

Firstly, this thesis designs and implements an improved algorithm based on co-training. To address the semantic drift problem, it presents a formula for scoring entities and templates, and filters out low-scoring entities and templates to keep the iterative algorithm effective. This improvement lets the algorithm keep improving over more iterations and increases the F1 value by 0.09.

Secondly, this thesis proposes a new method that uses word embeddings to improve the co-training algorithm. Making full use of feature information has been a research priority in recent years, and word embeddings have a strong advantage here. Word embeddings and other linguistic information are added to the templates to enrich their expressive power. This use of deep learning technology further improves the performance of the co-training algorithm; experiments show that the F1 value increases by 0.10.

Thirdly, this thesis proposes an improved co-training algorithm based on kernel functions. Drawing on the characteristics of supervised and semi-supervised algorithms, the two are combined: first, a large number of templates are obtained with the co-training algorithm; then an SVM is trained on these templates; finally, the two algorithms are cascaded. Through effective use of the kernel function, experiments show that the F1 value rises by 0.05.

Finally, we built a research system for the TAC tasks, which won first prize.

This thesis carries out many experiments in relation extraction and puts forward several new ideas that are of value for the further development of the field.
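
The score-based filtering used against semantic drift can be sketched as follows. This is a minimal illustration, not the implementation from the thesis: the abstract does not give the scoring formula, so the template score (the fraction of a template's matches that are already accepted pairs) and the pair score (a noisy-or over the scores of the templates that extract the pair) are assumptions. The toy corpus, the thresholds, and all function names are likewise hypothetical.

```python
from collections import defaultdict

# Toy corpus of (entity1, context between the entities, entity2) triples.
# Entirely made up for illustration; the context string plays the role of a template.
CORPUS = [
    ("Paris", "is the capital of", "France"),
    ("Berlin", "is the capital of", "Germany"),
    ("Rome", "is the capital of", "Italy"),
    ("Paris", "is a city in", "France"),
    ("Lyon", "is a city in", "France"),
    ("Munich", "is a city in", "Germany"),
    ("Madrid", ", capital of", "Spain"),
    ("Berlin", ", capital of", "Germany"),
]


def score_templates(corpus, known_pairs):
    """Assumed score: share of a template's matches that are already accepted pairs."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for e1, ctx, e2 in corpus:
        total[ctx] += 1
        if (e1, e2) in known_pairs:
            hits[ctx] += 1
    return {ctx: hits[ctx] / total[ctx] for ctx in total}


def score_pairs(corpus, template_scores):
    """Assumed score: noisy-or over the scores of the kept templates that extract a pair."""
    miss = defaultdict(lambda: 1.0)
    for e1, ctx, e2 in corpus:
        # Templates that were filtered out contribute a score of 0.
        miss[(e1, e2)] *= 1.0 - template_scores.get(ctx, 0.0)
    return {pair: 1.0 - m for pair, m in miss.items()}


def bootstrap(corpus, seeds, template_min=0.5, pair_min=0.5, max_iters=5):
    """Alternate between scoring templates and pairs, dropping low scorers each round."""
    known = set(seeds)
    kept_templates = {}
    for _ in range(max_iters):
        t_scores = score_templates(corpus, known)
        kept_templates = {t: s for t, s in t_scores.items() if s >= template_min}
        p_scores = score_pairs(corpus, kept_templates)
        new_known = known | {p for p, s in p_scores.items() if s >= pair_min}
        if new_known == known:  # nothing new was accepted, so stop iterating
            break
        known = new_known
    return known, kept_templates


seeds = {("Paris", "France"), ("Berlin", "Germany")}
pairs, templates = bootstrap(CORPUS, seeds)
```

With these seeds and thresholds, the loop accepts the two capital-of templates and the pairs they extract, while the looser "is a city in" template scores below the threshold and is dropped. Setting both thresholds to 0 lets that template through and admits pairs such as ("Lyon", "France"), which is the kind of drift the filtering is meant to prevent.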
Keywords/Search Tags:relationship extraction, co-training, kernel method, semantic drift, word embedding, semi-supervised