Research On Unsupervised Chinese Entity Relation Extraction Method

Posted on: 2016-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: Q Shi
Full Text: PDF
GTID: 2308330461493586
Subject: Computer technology
Abstract/Summary:
Entity relation extraction is an important problem in information extraction and is widely used in many fields. In recent years, with the rapid growth of the Internet, traditional search engines have gradually become unable to meet the increasing demands of users, and knowledge graph technology offers search engines a new approach. Knowledge graph construction builds on entity recognition and entity relation extraction. Entity recognition is already mature, so research on entity relation extraction is becoming more and more important.
Traditional entity relation extraction is based on hand-written rules or on supervised machine learning. Both approaches achieve high accuracy, but both require a great deal of manual intervention and generalize poorly across domains, so they are not suitable for large-scale application. In recent years, semi-supervised and unsupervised relation extraction have become active research topics. Research on semi-supervised and unsupervised methods began earlier abroad and has produced many good methods. However, because of the complexity of Chinese and the differences between Chinese and English grammar and syntax, many of those results do not carry over to Chinese. Many scholars have proposed Chinese entity relation extraction methods in recent years, but online language changes quickly, new linguistic phenomena keep appearing, and the grammar of online text is loose, so problems with feature acquisition and insufficient precision remain.
This thesis presents an unsupervised entity relation extraction method for large-scale corpora. Like other text extraction methods based on feature vectors, it rests on a common assumption: entity pairs that hold the same relation have the same or similar contexts. Under this assumption, entity relation extraction becomes a matter of computing the similarity of entity contexts, clustering the entities by that similarity, and then extracting the keywords that describe the entity relation.
The main work of this thesis falls into three parts. First, building on the classical context window, and through statistics and analysis of the data, it proposes an improved elastic context window method for obtaining feature words. Second, it introduces mutual information to calculate feature weights and addresses shortcomings of the mutual information method. Finally, it proposes improvements to classical K-means that use pre-clustering and standard scores to choose the value of K, select the initial cluster centers, and handle outliers.
To verify the effectiveness of the method, data were collected from the web and comparative experiments were run on several different schemes. The results show that each of the methods proposed in this thesis improves extraction performance.
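The context-similarity assumption described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names (`context_words`, `pmi_vectors`, `cosine`), the fixed-size window, the positive pointwise mutual information weighting, and the toy English sentences are all assumptions made for illustration. The abstract does not specify the elastic window rule or the corrections to mutual information, and the K-means clustering step is omitted here.

```python
import math
from collections import Counter

def context_words(tokens, i, j, window=2):
    """Words around an entity pair at positions i < j: up to `window`
    tokens before i, everything between, up to `window` tokens after j.
    A fixed window is assumed; the thesis uses an elastic one."""
    left = tokens[max(0, i - window):i]
    middle = tokens[i + 1:j]
    right = tokens[j + 1:j + 1 + window]
    return left + middle + right

def pmi_vectors(observations):
    """observations: list of (entity_pair, context_word_list).
    Returns {entity_pair: {word: weight}} using positive pointwise
    mutual information as the weight (an assumed variant)."""
    pair_word, pair_total, word_total = Counter(), Counter(), Counter()
    for pair, words in observations:
        for w in words:
            pair_word[(pair, w)] += 1
            pair_total[pair] += 1
            word_total[w] += 1
    n = sum(word_total.values())
    vectors = {pair: {} for pair, _ in observations}
    for (pair, w), c in pair_word.items():
        pmi = math.log((c * n) / (pair_total[pair] * word_total[w]))
        if pmi > 0:  # keep only words positively associated with the pair
            vectors[pair][w] = pmi
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy data (hypothetical): tokenized sentences with entity positions.
sentences = [
    (["Jobs", "founded", "Apple", "in", "1976"], 0, 2),
    (["Gates", "founded", "Microsoft", "in", "1975"], 0, 2),
    (["Paris", "is", "the", "capital", "of", "France"], 0, 5),
]
observations = [((t[i], t[j]), context_words(t, i, j)) for t, i, j in sentences]
vectors = pmi_vectors(observations)
sim_same = cosine(vectors[("Jobs", "Apple")], vectors[("Gates", "Microsoft")])
sim_diff = cosine(vectors[("Jobs", "Apple")], vectors[("Paris", "France")])
print(sim_same, sim_diff)
```

In this toy run the two "founded" pairs share context words and so score a higher similarity than the unrelated pair, which is the signal a clustering step would then group on.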
Keywords/Search Tags: knowledge graph, relation extraction, syntactic features, K-means, mutual information