The Research Of Binary Personal Relation Extraction On Web2.0

Posted on:2017-03-27

Degree:Master

Type:Thesis

Country:China

Candidate:L Xu

GTID:2308330509950226

Subject:Software engineering

Abstract/Summary:

With the development of computing, ever more information appears on the Internet, and automatically finding the useful part of it has become a difficult problem in computer science. Information extraction technology emerged to address this problem. Because of its broad application prospects, the extraction of personal entity relations, an important branch of information extraction, has attracted growing attention from researchers. Traditional relation extraction suffers from several problems: many different words describe the same relation, the quality of extraction templates is low, and analyzing personal entity relations requires a large amount of computation. Building on previous research on entity relation extraction, this paper presents a new method for extracting the relation between two people on Web2.0 that combines semi-supervised learning from machine learning with Information Gain from information theory. In response to these problems, it makes the following proposals.

First, for the phenomenon of multiple words sharing one meaning in Chinese sentences, this paper presents a crowdsourcing-based method for extending relation descriptions. A portion of the relation descriptions is given manually and first expanded using "How Net" and "word synonym forest"; the expanded collection is then distributed on the public network so that language enthusiasts can extend it a second time; finally, similarity is calculated and a subset of the synonyms is filtered into a repository for analysis.

Second, this paper puts forward a relation extraction template algorithm that combines semi-supervised learning with Information Gain. To reduce the time cost of creating templates by hand, semi-supervised learning is incorporated into the template creation process.
A small set of samples is first labeled manually, and the relation extraction process is then iterated repeatedly to produce more relation extraction templates. Because each word in a sentence carries a different amount of information depending on its position, this paper uses Information Gain values to determine the context window size of a template.

Third, for sentences that contain multiple personal entities, this paper proposes a screening method based on template matching. The method determines the relative position of the entity pair and the relation description in the template, and then selects as candidate entities those entities in the sentence that match this relative position information.

Finally, to avoid the wasted 0*0=0 computations in Vector Space Model text similarity calculation, this paper puts forward a candidate entity verification method based on non-zero weighting optimization. The method optimizes the dimension of the feature weight matrix and checks for non-zero weights before the similarity calculation, thereby reducing the amount of computation.
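
The abstract does not give the details of the non-zero weighting optimization, so the following is only a minimal sketch of the general idea: skip the 0*0=0 products when computing Vector Space Model similarity. It assumes cosine similarity as the measure and a dictionary-based sparse representation. The names `to_sparse` and `cosine_nonzero` are hypothetical and do not come from the thesis.

```python
import math


def to_sparse(weights):
    """Keep only the non-zero feature weights, keyed by dimension index.

    Assumption: a dict is used as the sparse representation; the thesis
    may reduce the feature weight matrix differently.
    """
    return {i: w for i, w in enumerate(weights) if w != 0}


def cosine_nonzero(a, b):
    """Cosine similarity between two sparse vectors.

    Only dimensions that are non-zero in both vectors contribute to the
    dot product, so products involving a zero weight are never computed.
    """
    shared = a.keys() & b.keys()
    # Non-zero check before the similarity calculation: with no shared
    # non-zero dimension the similarity is 0 and no arithmetic is needed.
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b)


# Usage with made-up weights: most dimensions are zero, as is typical
# for term-weight vectors of short sentences.
candidate = to_sparse([0, 2, 0, 1, 0, 0])
template = to_sparse([0, 1, 0, 0, 0, 0])
unrelated = to_sparse([1, 0, 0, 0, 0, 0])
similarity = cosine_nonzero(candidate, template)
no_overlap = cosine_nonzero(candidate, unrelated)
```

In this sketch, a dense computation over six dimensions is replaced by work on at most two stored weights per vector, and the pair with no shared non-zero dimension is rejected without any multiplication.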

Keywords/Search Tags:

Personal Entity, Relation Extraction, Information Gain, Machine Learning

Related items

1	Research On Chinese Personal Relation Extraction Based On BiGRU
2	N-ary Chinese Open Entity Relation Extraction
3	Research On Personal Relation Extraction Method For Social Network Application
4	Research Of Entity And Relation Extraction Based On Text
5	Research And System Implementation Of Entity Relation Extraction Algorithm Based On Text Generation
6	Research On Extraction Technology Of Relation Between Enterprise Entities Based On Machine Learning
7	Research Of Chinese Personal Social Relation Extraction Based On News Data
8	Relation Extraction Of Chinese Named Entities Based On Location And Semantic Features
9	Research On Entity And Relation Extraction Technology Based On Deep Learning
10	Research On Entity Relation Extraction Algorithm Based On Semi-supervised Machine Learning