People Relation Extraction Method Based On Feature Vector

Posted on:2016-07-14

Degree:Master

Type:Thesis

Country:China

Candidate:S S Fan

Full Text:PDF

GTID:2308330452468987

Subject:Computer application technology

Abstract/Summary:


With the rapid development and wide application of the Internet, the web now contains a large variety of information, such as relationships between person and place entities and between person entities. However, this information has not been used effectively, and how to mine the relationships between people from the web is a matter of growing concern. At present, entity extraction technology based on feature vectors is relatively mature and is one of the most commonly used methods. The feature-vector-based method converts entity relation extraction into a classification problem. Because the SVM (Support Vector Machine) achieves very high classification accuracy, it is generally combined with the feature-vector-based method. To address the defects of the general methods for relation extraction, the main work of this thesis is as follows:

1. Generic multi-class SVM methods have unclassifiable regions. When they are applied to person relation extraction, some relations are left unclassified, which degrades the extraction results. To address this, the DAG-SVM multi-class method is introduced. Since DAG-SVM suffers from "error accumulation", in this thesis person relations are divided into two types, kinship relations and other social relations, and these categories are used as the root to alleviate the error accumulation. The general multi-class method, the FMSVM multi-class method, and the DAG-SVM multi-class method are compared. The results show that the proposed method improves the accuracy of person relation extraction to some extent.

2. In person relation extraction, the dimensionality of the feature space is often very high, which leads to sparse vectors and reduces extraction efficiency. To address this, person relations are first divided into six categories, and then four feature selection algorithms (document frequency, information gain, mutual information, and the χ² statistic) are introduced to reduce the dimensionality of the feature space. Finally, an SVM classifier is used to extract the person entity relations. Experimental results show that the four feature selection algorithms not only maintain extraction performance but also effectively reduce the dimensionality of the vector space and dramatically improve relation extraction efficiency. Among them, the χ² statistic works best, followed by information gain.
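
As an illustration of the χ² feature selection mentioned in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes the commonly used document-count form of the statistic, χ²(t, c) = N(AD − CB)² / ((A + C)(B + D)(A + B)(C + D)), and takes the maximum over classes as each term's score. The function names `chi2_scores` and `select_top_k` and the toy sentences are hypothetical.

```python
from collections import Counter


def chi2_scores(docs, labels):
    """Return {term: max over classes of chi-square(term, class)}.

    docs   : list of token lists (one per sentence or document)
    labels : list of class labels, aligned with docs
    """
    n = len(docs)
    class_count = Counter(labels)  # class -> number of docs
    term_count = Counter()         # term -> number of docs containing it
    term_class = Counter()         # (term, class) -> docs of class containing term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):   # count each term once per document
            term_count[term] += 1
            term_class[(term, label)] += 1

    scores = {}
    for term, df in term_count.items():
        best = 0.0
        for cls, n_cls in class_count.items():
            a = term_class[(term, cls)]  # in class, contains term
            b = df - a                   # not in class, contains term
            c = n_cls - a                # in class, lacks term
            d = n - a - b - c            # not in class, lacks term
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:
                best = max(best, n * (a * d - c * b) ** 2 / denom)
        scores[term] = best
    return scores


def select_top_k(docs, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi2_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    # Toy data (hypothetical): two kinship and two social-relation sentences.
    docs = [["father", "son"], ["mother", "son"],
            ["colleague", "company"], ["boss", "company"]]
    labels = ["kinship", "kinship", "social", "social"]
    print(select_top_k(docs, labels, 2))
```

The retained terms would then define the reduced feature vectors passed to the SVM classifier. Terms that occur in every document of one class and in no others, such as "son" and "company" in the toy data, receive the highest scores.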

Keywords/Search Tags:

Relation extraction, Support Vector Machine, Feature Selection, Multi-classification, DAG-SVM, FMSVM


Related items

1	Research On Feature Selection And Multi-Class Classification Methods Based On Twin Support Vector Machine
2	Research On Application Of Support Vector Machine In Liver B Ultrasonic Images Classification
3	Study On Trademark Image Classification Based On Support Vector Machine
4	Study On Classification Methods Of Multi-class Mental Tasks Based On Support Vector Machine
5	Researches On Some Problems In Nonparallel Hyperplanes Support Vector Machine And Feature Extraction
6	Study On Least Squares Support Vector Machine And Its Applications
7	Research On Feature Extraction, Selection And Classification Algorithms For Pulmonary CAD
8	Research On Image Classification Based On Support Vector Machine
9	Improved Support Vector Machine And Its Application
10	Feature Selection And Classification For Imbalanced Medical Data