Research And Implementation Of Open Chinese Entity Relation Extraction

Posted on: 2014-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: L L Kang
Full Text: PDF
GTID: 2268330425491544
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of computer and network technology, the Internet has become the largest information platform, and the volume of data on it is growing sharply. Traditional search methods can no longer meet the needs of user queries and practical applications. Information extraction aims to convert content expressed in semi-structured or unstructured natural language into structured information stored in a knowledge base, so as to provide intelligent, user-friendly information retrieval. As a vital part of information extraction, entity relation extraction is important for understanding and organizing information on the network and for providing fine-grained retrieval services.

The mass of open-domain text on the network contains a large number of entities and entity relations of unknown types. Traditional entity relation extraction methods, which extract predefined relation types from domain-specific corpora, face serious challenges. This thesis studies an open relation extraction method that does not depend on manually labeled corpora. The method can obtain a large number of reliable entity relations, not limited to predefined categories, from Chinese corpora covering many domains. It exploits the redundancy of network data to automatically acquire the entities and sentences used for relation extraction. Based on the characteristics of Chinese expression, the thesis studies how to extract relation feature terms, that is, the terms that identify entity relations. Three extraction methods are proposed: one based on physical position, one based on syntactic analysis, and one based on a Markov logic network probability model. Once the set of relation feature terms has been obtained, methods for converting relation features into numerical values are studied. Entity pairs are then clustered using agglomerative hierarchical clustering, with average linkage as the measure of cluster similarity, so that entity pairs with different relations are gathered into different clusters and entity pairs in the same cluster are marked with the same relation label. Finally, the reliability of the clustering results is evaluated to improve the quality of the extracted relations. Evaluation methods for open relation extraction are further studied, and a probability model that can effectively evaluate the reliability of a relation is proposed so that the extracted relations have higher credibility.

The experiments show that the open Chinese entity relation extraction method studied in this thesis performs well. A large number of valuable entity relations can be acquired with the method, and it can satisfy the requirements of user queries and practical applications.
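To make the clustering step mentioned in the abstract more concrete, the following is a minimal sketch of agglomerative hierarchical clustering with average linkage over entity pairs. It is not the thesis's implementation: the cosine similarity measure, the stopping threshold, the function names, and the toy feature-term weights are all assumptions introduced for illustration.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts (assumed measure)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def average_link(a, b, vectors):
    """Average linkage: mean pairwise similarity between members of clusters a and b."""
    total = sum(cosine(vectors[i], vectors[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def cluster_entity_pairs(vectors, threshold=0.5):
    """Repeatedly merge the two most similar clusters until none reaches the threshold."""
    clusters = [[i] for i in range(len(vectors))]  # start with one cluster per entity pair
    while len(clusters) > 1:
        (x, y), best = max(
            (((p, q), average_link(clusters[p], clusters[q], vectors))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because combinations() guarantees x < y
    return clusters


# Hypothetical toy data: each entity pair is described by weighted relation feature terms.
entity_pairs = [("Yao Ming", "Shanghai"), ("Lu Xun", "Shaoxing"),
                ("Jack Ma", "Alibaba"), ("Ren Zhengfei", "Huawei")]
vectors = [
    {"born": 1.0, "in": 0.5},
    {"born": 1.0, "hometown": 0.5},
    {"founded": 1.0, "company": 0.5},
    {"founded": 1.0, "chairman": 0.5},
]
clusters = cluster_entity_pairs(vectors, threshold=0.5)
for members in clusters:
    print([entity_pairs[i] for i in members])
```

In this sketch the first two entity pairs share the feature term "born" and the last two share "founded", so they end up in two separate clusters, each of which could then receive a single relation label. The threshold controls how aggressively clusters are merged and would need tuning on real data.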
Keywords/Search Tags:information extraction, entity relation extraction, open information extraction