Research On System Of Multi-field Information Extraction Based On Semantic Role And Concept Graphs

Posted on: 2011-09-06 | Degree: Master | Type: Thesis
Country: China | Candidate: X X Yang | Full Text: PDF
GTID: 2178360305459301 | Subject: Computer application technology
Abstract/Summary:
Information extraction is an active research topic in natural language processing. Outside China, research began in the last century and has produced many results, whereas research on Chinese information extraction is still at an early stage. Most existing extraction methods are statistical and lack semantic support. Statistical extraction is inefficient and imprecise because it ignores the semantic links among the words in a sentence. Moreover, a simple probability model cannot capture the meaning of a sentence, so the extraction results are of low quality and cannot meet the demands of intelligent applications. To address these shortcomings of traditional extraction methods, this thesis builds on previous studies and presents a new approach that uses semantic roles and the semantic resources of HowNet to extract information. The resulting model is a multi-field information extraction system based on semantic information.
The model is built on semantic roles and conceptual graphs, and processing proceeds as follows. First, the semantic roles in each sentence are labeled, and a preprocessing module then filters out function words and interjections. Second, conceptual graphs are generated from the semantic information using the algorithm presented in this thesis. Third, the similarities between conceptual graphs are calculated in order to identify the domain of each scene. Extraction templates are generated by an automatic classification method whose main idea derives from Bootstrapping. Finally, the extraction rules are constructed; this module uses semantic roles to generate the rules in order to improve extraction accuracy.
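The graph-building and scene-identification steps described above can be illustrated with a minimal sketch. It is not the thesis's algorithm, which the abstract does not specify. It assumes PropBank-style role labels (`A0`, `A1`), a hypothetical stop list in place of a real function-word filter, a conceptual graph flattened to a set of (predicate, role, argument) triples, and Jaccard overlap as a stand-in for the similarity measure. All function names and sample data are invented for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# A conceptual graph is approximated as a set of relation triples:
# (predicate concept, semantic role label, argument concept).
Triple = Tuple[str, str, str]
Frame = Tuple[str, Dict[str, str]]  # (predicate, {role label: argument text})

# Hypothetical stop list standing in for function words and interjections.
STOP_WORDS = {"the", "a", "an", "of", "oh", "ah"}


def build_graph(frames: List[Frame]) -> FrozenSet[Triple]:
    """Turn role-labeled frames into triples, dropping stop words."""
    triples = set()
    for predicate, roles in frames:
        for role, filler in roles.items():
            words = [w for w in filler.lower().split() if w not in STOP_WORDS]
            if words:  # skip arguments that consist only of stop words
                triples.add((predicate.lower(), role, " ".join(words)))
    return frozenset(triples)


def similarity(g1: FrozenSet[Triple], g2: FrozenSet[Triple]) -> float:
    """Jaccard overlap of two triple sets (an assumed similarity measure)."""
    if not g1 and not g2:
        return 0.0
    return len(g1 & g2) / len(g1 | g2)


def identify_domain(graph: FrozenSet[Triple],
                    prototypes: Dict[str, FrozenSet[Triple]]) -> str:
    """Return the domain whose prototype graph is most similar."""
    return max(prototypes, key=lambda d: similarity(graph, prototypes[d]))


# Invented sample data: one prototype graph per domain and one new sentence.
prototypes = {
    "finance": build_graph([("acquire", {"A0": "the company", "A1": "a startup"})]),
    "sports": build_graph([("defeat", {"A0": "the team", "A1": "the rival"})]),
}
sentence = build_graph([("acquire", {"A0": "the company", "A1": "a rival"})])
domain = identify_domain(sentence, prototypes)
print(domain)
```

Exact triple matching is deliberately crude. A system using HowNet would presumably compare concepts by semantic distance rather than by string equality, so this sketch shows only the overall flow from labeled roles to a domain decision.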
Of the modules described above, scene domain division, template generation, and extraction-rule construction are the main contributions of this thesis. The final part of the thesis reports the evaluation of the system, which consists of a vertical comparison and a horizontal comparison. In the vertical comparison, we extract information from the same domain using different techniques. In the horizontal comparison, we run two experiments: one on scene division and one on cross-domain information extraction. We use three different approaches to extract the multi-domain information. Experimental results show that our extraction method is feasible and effective, and that it improves both the accuracy and the recall of the information extraction system.
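The abstract does not say how accuracy and recall were computed. As a hedged illustration, the sketch below uses the standard set-based definitions of precision, recall, and F1, assuming that "accuracy" here corresponds to precision. The function name and the sample items are hypothetical.

```python
from typing import Set, Tuple


def precision_recall_f1(extracted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Score a set of extracted items against a gold-standard set."""
    correct = len(extracted & gold)  # items that are both extracted and correct
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented sample: four extracted slot fillers, three gold fillers, two in common.
p, r, f = precision_recall_f1({"a", "b", "c", "d"}, {"a", "b", "e"})
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Comparing several extraction approaches on the same gold set with a function like this would yield the kind of side-by-side figures that a vertical or horizontal comparison requires.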
Keywords/Search Tags: information extraction, semantic roles, conceptual graph similarity calculation, semantic computation, HowNet