Research Of Full-text Retrieval System Based On Ontology

Posted on: 2014-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 2268330425966309
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of the Internet, how to accurately and quickly retrieve valuable information from the mass of information resources has become a focus of the information retrieval field. Traditional full-text retrieval technology, which relies mainly on keyword matching, can quickly complete the retrieval of massive amounts of information. However, it only matches the retrieval request against the index literally and lacks the ability to understand and analyze the request at the semantic level, so the retrieval results may miss important information or contain a lot of irrelevant information.

An ontology effectively organizes and describes information resources. Its concepts are connected by relationships, and the concepts and relationships together support logical reasoning over the ontology. In this thesis, ontology technology is introduced into the full-text retrieval system. It provides semantic support for retrieval requests and, by making use of the ontology's logical reasoning ability, can greatly improve both the retrieval accuracy of the traditional full-text retrieval system and its ability to filter useless information. The main work, which focuses on an ontology-based semantic text retrieval system, is as follows:

(1) On the basis of an in-depth study of the ontology and its concept semantic similarity, the problems in concept semantic similarity computation are analyzed. An integrated weighted semantic similarity calculation method based on principal component analysis (PCA) is presented. The method integrates both the traditional semantic distance-based algorithms and the information content-based algorithms. In addition, depth, density, and semantic coincidence degree factors are introduced into the method for a comprehensive analysis. To determine suitable weights in the integrated algorithm, principal component analysis is used to improve the weight allocation. The experimental results show that the proposed method can effectively improve the accuracy of the concept semantic similarity calculation.

(2) A general-purpose parsing application model for ontology language files is designed and implemented using the Jena software package. The model can calculate the semantic similarity between concepts, based on the relationships between the concepts in the ontology, and it also supports importing concept pairs and their semantic similarity into a relational database.

(3) The process, framework, and core technology of full-text retrieval are studied, and the architecture of the Lucene.Net full-text toolkit is analyzed in depth. An ontology-based semantic full-text retrieval model is designed using the Lucene.Net toolkit and ontology-related technology, and the detailed design of each module is given. In the original system, the query module and the results feedback module lack semantic support, which leads to lower accuracy of retrieval results. To solve this problem, ontology concept semantic similarity is introduced to extend the function of these modules.

(4) A full-text retrieval system based on ontology is designed and implemented. The simulation results on a specific query instance demonstrate that the system is superior to the traditional text retrieval system in recall rate and precision rate.
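The weighting step in item (1) can be illustrated with a minimal Python sketch. The abstract does not give the formula, so everything here is an assumption: the function names `pca_weights` and `combined_similarity` are hypothetical, the sample scores are made up for illustration, and taking the normalized absolute loadings of the first principal component as weights is one common PCA-based weighting scheme, not necessarily the one used in the thesis.

```python
import numpy as np


def pca_weights(scores: np.ndarray) -> np.ndarray:
    """Return one weight per similarity measure (column of `scores`).

    Assumption: weights are the normalized absolute loadings of the
    first principal component. `scores` needs at least two columns.
    """
    # Standardize each measure so that no column dominates by scale.
    std = scores.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    z = (scores - scores.mean(axis=0)) / std

    # Covariance of the standardized measures (columns are variables).
    cov = np.cov(z, rowvar=False)

    # eigh returns eigenvalues in ascending order, so the last
    # eigenvector belongs to the largest eigenvalue.
    _, eigvecs = np.linalg.eigh(cov)
    loadings = np.abs(eigvecs[:, -1])

    return loadings / loadings.sum()


def combined_similarity(scores: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Weighted sum of the individual similarity measures per concept pair."""
    return scores @ weights


# Hypothetical scores: rows are concept pairs, columns are measures
# (for example distance-based, information content-based, depth-based).
scores = np.array([
    [0.90, 0.80, 0.85],
    [0.20, 0.30, 0.25],
    [0.60, 0.50, 0.70],
    [0.40, 0.45, 0.30],
])

weights = pca_weights(scores)
combined = combined_similarity(scores, weights)
print("weights:", weights)
print("combined similarity:", combined)
```

Because the weights are non-negative and sum to 1, each combined value stays within the range of the individual measures for that concept pair.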
Keywords/Search Tags: full-text retrieval, Lucene.Net, ontology, concept semantic similarity, principal component analysis