Research And Implementation Of Syntactic Pattern Recognition Approach For Chinese Relation Extraction

Posted on: 2018-10-27  Degree: Master  Type: Thesis
Country: China  Candidate: B Hao
GTID: 2348330512489040  Subject: Engineering
Abstract/Summary:
With the advent of the era of big data and artificial intelligence, managing and using rapidly growing volumes of data intelligently and efficiently has become a major challenge. Information extraction extracts knowledge from vast amounts of unstructured text. Relation extraction establishes relationships between discrete pieces of knowledge; this provides the basis for organizing knowledge units into an interconnected network structure and supplies data for upper-level applications such as semantic retrieval and question answering systems. Relation extraction therefore has important research value and broad application prospects.

Unlike traditional relation extraction, which requires relation types to be predefined comprehensively, open-domain relation extraction usually expresses the relation between arguments through relation words. This thesis studies how to extract relations of any type between arguments from open-domain text by using the syntactic patterns of arguments and relation words together with high-quality tuples. We propose a syntactic pattern recognition based approach for Chinese relation extraction (SPRE), which is divided into three stages: constructing the syntactic pattern set, training the filtering model, and extracting relations. In the first stage, exploiting the redundancy of a large-scale corpus, we use a small number of seed tuples to find syntactic patterns in the corpus and construct a syntactic pattern set. The second and third stages share the same relation tuple extraction method; the only difference is that the filtering model obtained in the second stage is used in the third stage to select high-quality tuples. The relation tuple extraction method consists of four steps: preprocessing, extracting noun phrases, extracting candidate relation words, and recognizing the syntactic patterns of relation tuples.

The main work of this thesis covers three aspects. First, based on the characteristics of the Chinese language, the thesis proposes three algorithms for extracting light verb constructions (LVCs), LVC-related prepositions, and the relation words of special sentence patterns. During relation word extraction, the relation words are refined using light verb tables and special relation word extraction rules, which makes them more complete and accurate. Second, based on an analysis of the syntactic pattern set, we extract basic noun phrases using noun phrase extraction rules and then apply a noun phrase optimization algorithm to address incomplete arguments and logic errors. Third, using the high-quality relation tuples extracted by our algorithm as seed relation tuples, the syntactic pattern set can be extended automatically. The experimental results show that SPRE improves the accuracy and completeness of both relation words and arguments in relation tuples, which in turn improves the precision and recall of the extraction results.
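
As a rough illustration of the seed-based pattern construction and pattern-based extraction described above, the following minimal Python sketch may help. It is not the thesis's implementation: the function names (induce_pattern, build_pattern_set, extract), the representation of a pattern as a sequence of (slot, POS tag) pairs, and the Penn Chinese Treebank style tags in the toy data are all assumptions made for illustration. Sentences are assumed to be already segmented and tagged, and the filtering model and the noun phrase and relation word optimization steps are omitted.

```python
from typing import List, Optional, Set, Tuple

Token = Tuple[str, str]                 # (word, POS tag), assumed pre-segmented
Seed = Tuple[str, str, str]             # (argument 1, relation word, argument 2)
Pattern = Tuple[Tuple[str, str], ...]   # ((slot, POS tag), ...); slot "-" is filler


def induce_pattern(sentence: List[Token], seed: Seed) -> Optional[Pattern]:
    """Derive a pattern from a sentence that contains all three seed elements."""
    words = [w for w, _ in sentence]
    if any(x not in words for x in seed):
        return None
    arg1, rel, arg2 = seed
    # Map the first occurrence of each seed element to its slot label.
    slots = {words.index(arg1): "ARG1",
             words.index(rel): "REL",
             words.index(arg2): "ARG2"}
    if len(slots) < 3:  # two seed elements landed on the same token
        return None
    lo, hi = min(slots), max(slots)
    return tuple((slots.get(i, "-"), sentence[i][1]) for i in range(lo, hi + 1))


def build_pattern_set(corpus: List[List[Token]], seeds: List[Seed]) -> Set[Pattern]:
    """Collect every pattern that any seed tuple induces in the corpus."""
    patterns: Set[Pattern] = set()
    for sentence in corpus:
        for seed in seeds:
            pattern = induce_pattern(sentence, seed)
            if pattern is not None:
                patterns.add(pattern)
    return patterns


def extract(sentence: List[Token], patterns: Set[Pattern]) -> List[Seed]:
    """Return tuples from every window whose POS sequence matches a pattern."""
    tags = [pos for _, pos in sentence]
    found: List[Seed] = []
    for pattern in patterns:
        n = len(pattern)
        wanted = [pos for _, pos in pattern]
        for start in range(len(sentence) - n + 1):
            if tags[start:start + n] == wanted:
                filled = {slot: sentence[start + k][0]
                          for k, (slot, _) in enumerate(pattern) if slot != "-"}
                found.append((filled["ARG1"], filled["REL"], filled["ARG2"]))
    return found


if __name__ == "__main__":
    # Toy data; the sentences and tags are illustrative only.
    corpus = [[("马云", "NR"), ("创立", "VV"), ("阿里巴巴", "NR")]]
    seeds = [("马云", "创立", "阿里巴巴")]
    patterns = build_pattern_set(corpus, seeds)
    print(extract([("任正非", "NR"), ("创立", "VV"), ("华为", "NR")], patterns))
```

In this sketch, tuples extracted with high confidence could be fed back into build_pattern_set as new seeds, which mirrors the automatic extension of the pattern set mentioned in the abstract. Matching on POS sequences alone over-generates, which is one reason a filtering step such as the one the thesis describes would be needed in practice.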
Keywords/Search Tags:relation extraction, syntactic pattern recognizing, relation words, open domain text