Research On Emergency Intelligent Information Retrieval System Based On Domain Knowledge Model

Posted on: 2014-11-08; Degree: Doctor; Type: Dissertation
Country: China; Candidate: Y H Yang
GTID: 1268330401963123; Subject: Computer Science and Technology
Abstract/Summary:
In recent years, emergencies have occurred frequently and drawn wide public attention. Information about emergencies on the Internet is growing rapidly, and people’s demands on emergency information retrieval keep rising. Applying ontology to an information retrieval system helps improve retrieval performance through better organization and semantics on the one hand, and supports reasoning based on the logical relations and reasoning rules between concepts on the other. Research on intelligent information retrieval systems based on an ontology knowledge model therefore has great theoretical significance and application value. This dissertation studies key theories and technologies of such a system for the emergency domain, including how to organize and represent emergency knowledge, how to automatically acquire domain concepts and the relationships between them for ontology expansion, and how to understand and process users’ queries semantically to implement semantic retrieval. The main contributions and innovations of this dissertation are as follows:

(1) Few studies on emergency domain knowledge models have been carried out so far, and knowledge organization and representation methods for the emergency domain have not yet been proposed. To address this, emergency domain knowledge is analyzed, and the concepts and the relationships between them are presented. On that basis, an emergency ontology model is constructed and an OWL-based emergency domain knowledge representation method is proposed; they are used for emergency knowledge organization and representation respectively, so that emergency knowledge can be shared. The emergency ontology is evaluated quantitatively with the OntoQA method. The evaluation results demonstrate that this emergency ontology can express more emergency knowledge and includes more instances.

(2) Existing automatic concept extraction methods cannot acquire Chinese compound domain concepts and do not take semantic factors into account. In this dissertation, a Bootstrapping-based domain concept extraction algorithm (BCAE algorithm) is proposed. Conditions for identifying compound words are defined based on mutual information and information entropy. Conditions for selecting candidate concepts are proposed that consider co-occurrence sentence frequency and support. Under these conditions, compound words with low occurrence frequency are not filtered out. A semantic factor is also introduced: by calculating the semantic similarity with important concepts based on the probability distribution of context information, semantically similar domain concepts with low frequency can also be acquired. Comparative experiments show that the recall and precision of concept extraction with the BCAE algorithm are up to 17% and 20% higher than those of the domain concept extraction algorithm based on domain correlation and consistency degree (FCRC algorithm), and up to 11% and 17% higher than those of the Bootstrapping-based automatic acquisition algorithm of domain words (FWB algorithm).

(3) Existing algorithms for extracting relationships among concepts can acquire only a few types of relationships or cannot determine the types of relationships. In this dissertation, a hybrid automatic relationship extraction algorithm (HRAE algorithm) is proposed. The relationships between domain concepts are classified into two kinds: relationships of unknown types and relationships of known types. For relationships of unknown types, a method based on association rules and different sentence patterns is proposed and used to extract the relationships. This prevents verbs that express the relationship between two concepts, but do not appear between them in the sentence, from being omitted. For relationships of known types, methods for constructing and expanding relationship extraction rules are presented, and these rules are used to extract the relationships. Comparative experiments against the relationship extraction method based on association rules (ARRE algorithm), the relationship learning method (NTRL algorithm), and the graph-based relationship extraction method (GRAONTO algorithm) demonstrate that the HRAE algorithm can acquire the core domain relationships and performs better: its precision-recall, F1-measure, and F0.5-measure values are 6%, 6%, and 4% higher, respectively, than the best values obtained by the ARRE, NTRL, and GRAONTO algorithms.

(4) Existing similarity measures do not consider the influence factors thoroughly, nor do they make full use of the semantic knowledge in the ontology. To address these problems, this dissertation analyzes how the semantic distance, the level factors, and the overlapping degree between the sets of hypernyms and hyponyms affect semantic similarity. On this basis, an ontology-based semantic similarity computation model (OSSC model) is established. The calculation of the overlapping degree between the sets of hypernyms and hyponyms makes use of the semantic relationships between concepts. In addition, an association between the semantic distance and the concept level is established, and the number of parameters the OSSC model uses to adjust the contribution rate of each influence factor is reduced, so less time is needed to determine appropriate parameter values. A comparative experiment is carried out between the OSSC model and ten other similarity measures, proposed by D. Sanchez, Petrakis, Rodriguez & Egenhofer, Leacock & Chodorow, Li, Wu & Palmer, Hirst & St-Onge, Resnik, Lin, and Jiang & Conrath. The correlation coefficient assessment method is used in this experiment: the greater the correlation coefficient, the higher the accuracy of the similarity measure. The results show that the average correlation between the similarities obtained with the OSSC model and the benchmark reaches 0.85 on two standard datasets, Miller-Charles and Rubenstein-Goodenough. This is greater than 0.83, the best value among the ten other similarity measures, which demonstrates that the OSSC model is highly accurate.

(5) A prototype intelligent information retrieval system based on the emergency ontology (EIIRS) is implemented. In EIIRS, emergency text information is collected with an emergency topic crawler. An emergency ontology expansion framework is presented, which applies the proposed extraction algorithms for emergency concepts (BCAE algorithm) and relationships (HRAE algorithm) to expand the emergency ontology. The ontology now includes 51 classes, 75 properties, and 4234 instances. Thirty-three inference rules are designed according to the semantic relationships in the emergency ontology, and reasoning over the emergency ontology is implemented with the Jena reasoning engine. To implement semantic retrieval of emergency information, a semantic retrieval model based on the emergency ontology (EOBSR model) is established. In the EOBSR model, to reduce the homogeneity of expanded queries and the topic drift of retrieval results, a semantic query expansion and sorting method is presented that draws on the multiple semantic relationships provided by the emergency ontology and on the semantic similarity computation model. The results of the semantic retrieval experiment on emergency information demonstrate that, with the proposed semantic retrieval model, concepts that have special relationships with the queries can be expanded, and more relevant retrieval results are ranked at the top. The redefined precision of the semantic retrieval is 33.9% higher on average than that of Lucene retrieval.
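
The correlation coefficient assessment described in contribution (4) can be illustrated with a minimal Python sketch. It is not the OSSC model, whose formula the abstract does not give. Instead it scores concept pairs with the Wu & Palmer measure, one of the ten baselines named above, and then computes the Pearson correlation between those scores and a set of reference ratings. The taxonomy, the concept pairs, and the ratings are all invented for illustration; they are not taken from the emergency ontology, Miller-Charles, or Rubenstein-Goodenough.

```python
from math import sqrt

# Hypothetical is-a taxonomy (child -> parent); not the dissertation's ontology.
PARENT = {
    "event": None,
    "natural_disaster": "event",
    "accident": "event",
    "earthquake": "natural_disaster",
    "flood": "natural_disaster",
    "fire": "accident",
}


def path_to_root(concept):
    """Return [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(concept))


def wu_palmer(a, b):
    """Wu & Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # The first ancestor of b that is also an ancestor of a is the
    # least common subsumer (lcs) in a tree-shaped taxonomy.
    lcs = next(c for c in path_to_root(b) if c in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented concept pairs and reference ratings, for illustration only.
PAIRS = [("earthquake", "flood"), ("earthquake", "fire"), ("flood", "accident")]
REFERENCE_RATINGS = [3.5, 1.0, 1.2]

scores = [wu_palmer(a, b) for a, b in PAIRS]
correlation = pearson(scores, REFERENCE_RATINGS)
print(f"correlation with reference ratings: {correlation:.2f}")
```

Under this protocol, a measure whose scores follow the reference ratings more closely yields a correlation nearer to 1, which is the sense in which the abstract compares 0.85 against 0.83.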
Keywords/Search Tags: emergency, ontology modeling, knowledge acquisition, semantic similarity computation, intelligent information retrieval system