Research on Entity Relation Extraction Technology for Network Text Information

Posted on: 2018-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
GTID: 2348330563951359
Subject: Information and Communication Engineering
Abstract/Summary:
The rapid development of the Internet has driven the big data era. While the network brings convenience, it also brings many problems. Information explosion is one of the most important of these, and extracting valuable information from massive network data has become a primary task. Against this background, entity information extraction technology has emerged.

Entity relation extraction aims to extract the semantic relations between entities from massive network text. Traditional template-based methods that rely on a knowledge base can no longer cope with data at this scale. Although the rapid development of natural language processing and machine learning has brought new ideas to relation extraction, three problems remain: (1) in named entity disambiguation, the characteristics of co-occurring entities in the target text are not analyzed, and the analysis of context features is still confined to the semantic features of the entity being disambiguated; (2) heuristic data labeling in distant supervision easily introduces noise and does not take the incompleteness of the knowledge base into account; (3) relation extraction algorithms based on neural networks extract features from a single perspective and lack multi-aspect extraction of text semantic features.

To address these problems, this thesis studies entity disambiguation based on context information, heuristic labeling of relation data, and relation feature extraction, with the goal of further improving the accuracy and practicality of named entity relation extraction. The main research and innovation points are as follows:

(1) To address the insufficient analysis of co-occurring entity relationships in the text to be disambiguated, this thesis proposes an entity disambiguation method based on the biterm topic model. The method assumes that an entity takes on different topics in different semantic environments, and that other entities appearing in the same document can, to a certain extent, help determine what the entity being disambiguated refers to. Biterms are therefore constructed from named entities so that co-occurring entity relationships are incorporated into the topic model. Experiments on web text data show that the proposed method effectively improves the precision of entity disambiguation.

(2) To address the noise that heuristic corpus labeling easily introduces in distant supervision, this thesis proposes a distantly supervised relation extraction method under type constraints. Mutual information is first used to expand the knowledge base, and entity relation labels are then learned heuristically through distant supervision under type constraints, which effectively reduces the ratio of false instances. A comparison with related algorithms on open datasets shows that the proposed algorithm achieves better precision in heuristic corpus labeling.

(3) To address the limited feature extraction of traditional single neural network methods, a neural network model based on deep feature fusion for convolutional neural networks is proposed. Word vectors and shortest dependency paths are used as two different input representations for models based on convolutional neural networks, which automatically learn text features along more dimensions. Deep fusion is then carried out to improve the accuracy of relation extraction. Experiments on the SemEval-2010 Task 8 dataset show that the model effectively combines traditional text features with the high-dimensional features learned by the neural network and improves relation extraction performance.
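The type-constrained labeling step in point (2) can be illustrated with a minimal sketch. It is not the thesis implementation: the toy knowledge base, the type signatures, and the names `Mention`, `KB`, `TYPE_CONSTRAINTS`, and `label_pair` are all assumptions made for illustration, and the mutual-information expansion of the knowledge base is not shown.

```python
from typing import NamedTuple, Optional


class Mention(NamedTuple):
    """An entity mention found in a sentence, with its predicted type."""
    text: str
    entity_type: str


# Hypothetical toy knowledge base: (head, tail) -> relation.
KB = {
    ("Barack Obama", "Hawaii"): "born_in",
    ("Steve Jobs", "Apple"): "founder_of",
}

# Hypothetical type signature (head type, tail type) for each relation.
TYPE_CONSTRAINTS = {
    "born_in": ("PERSON", "LOCATION"),
    "founder_of": ("PERSON", "ORGANIZATION"),
}


def label_pair(head: Mention, tail: Mention) -> Optional[str]:
    """Return a distant-supervision label, or None if no label applies.

    Plain distant supervision labels a sentence with a relation whenever
    the entity pair appears in the knowledge base. The type check below
    additionally rejects pairs whose mention types do not match the
    relation's signature, which filters some false instances.
    """
    relation = KB.get((head.text, tail.text))
    if relation is None:
        return None
    if TYPE_CONSTRAINTS.get(relation) != (head.entity_type, tail.entity_type):
        return None
    return relation


# "Apple" typed as an organization keeps the label; typed as food, it is dropped.
kept = label_pair(Mention("Steve Jobs", "PERSON"), Mention("Apple", "ORGANIZATION"))
dropped = label_pair(Mention("Steve Jobs", "PERSON"), Mention("Apple", "FOOD"))
```

In this sketch the type check acts purely as a filter on string-matched pairs; how the thesis actually integrates type constraints into label learning may differ.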
Keywords/Search Tags: entity disambiguation, topic model, distant supervision, relation extraction, multi-instance learning, convolutional neural network, deep learning