
Research And Application Of Semantic Technology In Web Topic Information Retrieval

Posted on: 2013-12-19
Degree: Master
Type: Thesis
Country: China
Candidate: J F Xie
GTID: 2248330395968473
Subject: Management Science and Engineering
Abstract/Summary:
With the continuing development of the Internet, the Web has become the largest information carrier and the main place where people obtain information. As one of the most important means of obtaining information, information retrieval technology has developed rapidly. But as Web information resources grow, traditional search engines can neither satisfy users' needs well nor make effective use of the information. There are two major reasons. On one hand, the massive information resource base yields large numbers of results with low precision, so much time and energy is wasted on a second or even repeated round of information filtering. On the other hand, the heterogeneity of Web information and the gap in understanding between users and machines lead to semantic isolation, which increases the probability of mismatches in keyword-based retrieval systems and greatly reduces their effective recall ratio.

As the core technology of the Semantic Web, ontology plays a crucial role in semantic representation. It makes human-machine communication possible, raises the level of machine intelligence, and brings a new opportunity for the development of the Web. An ontology-based topic retrieval system could greatly promote the effective utilization of resources and meet users' needs to the greatest extent. With this in mind, this paper designs and builds an ontology-based topic retrieval system to improve the effective utilization of resources and the efficiency of retrieval. The major research content and innovations are as follows:

① Construction of a Chinese Computer Technology Domain Ontology. Based on application requirements and current theories and techniques, this paper makes some improvements to the traditional ontology modeling method. On one hand, drawing together keywords from academic documents, thesauri, and related concepts adopted by popular knowledge bases helps ensure the quality of concepts. On the other hand, taking application requirements fully into consideration before ontology design avoids, to some extent, being obstructed by minutiae.
Finally, this paper designs and semi-automatically builds a Chinese computer technology domain ontology, then carries out related query experiments to test the ontology and prepare for subsequent information retrieval.

② Implementation of ontology-based query extension. Because keyword-based search engines suffer from semantic isolation, we resort to matching at the level of meaning, which is called semantic search. Its core module is semantic extension, which can recall related information expressed with different keywords. Whether a concept is extended depends mainly on the degree of correlation between concepts, which is determined by two major factors. One is the intrinsic relationship defined between them (including shared properties, etc.); we call it the internal factor, or "Relation" for short. The other is their distance in the ontology tree; we call it the external factor, or "Similarity" for short. In view of application requirements and current research on correlation calculation, this paper makes some improvements to the correlation algorithms, and designs and implements a new expansion algorithm suited to the prototype system.

③ Design and implementation of an ontology-based topic retrieval system. The retrieval system is composed of two parts: a Web information retrieval subsystem (OntCT-SE) and an ontology query subsystem (OntSearch). OntSearch is designed to help users fully understand the concepts and knowledge structure of the specified field, which is presented in tree form. With the help of this system, users can easily and quickly query the domain ontology, including ontology classes, attributes, and triples. OntCT-SE is an ontology-based topic retrieval system, built on the constructed domain ontology, the improved correlation algorithms, and the query extension module.
It is designed in the hope of improving the efficiency of traditional retrieval systems to a certain extent.

In order to verify the effectiveness of the improved algorithm and the efficiency of the prototype retrieval system, this paper finally conducts comparative experiments. The results indicate that this ontology-based topic retrieval system performs efficiently on query extension and topic information search. To a certain extent, it improves the recall precision of keyword-based retrieval systems.
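
To make the query extension idea in ② more concrete, the following is a minimal sketch, not the thesis's algorithm. It assumes that "Similarity" decays with tree distance as 1/(1+d), that "Relation" is the Jaccard overlap of declared properties, and that the two are combined by a weight `alpha` and compared to a `threshold`. The toy ontology, the property names, and every function name here are hypothetical and chosen only for illustration.

```python
# Hypothetical toy ontology: each concept maps to its parent in an "is-a" tree.
PARENT = {
    "computer technology": None,
    "information retrieval": "computer technology",
    "search engine": "information retrieval",
    "semantic search": "information retrieval",
    "database": "computer technology",
}

# Hypothetical declared properties per concept (not taken from the thesis).
PROPERTIES = {
    "computer technology": set(),
    "information retrieval": {"hasIndex", "hasQuery"},
    "search engine": {"hasIndex", "hasQuery", "hasCrawler"},
    "semantic search": {"hasQuery", "hasOntology"},
    "database": {"hasIndex", "hasSchema"},
}


def ancestors(concept):
    """Return the path from the concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def tree_distance(a, b):
    """Count edges between a and b through their lowest common ancestor."""
    steps_from_b = {c: i for i, c in enumerate(ancestors(b))}
    for steps_from_a, c in enumerate(ancestors(a)):
        if c in steps_from_b:
            return steps_from_a + steps_from_b[c]
    raise ValueError("concepts are not in the same tree")


def similarity(a, b):
    """External factor: assumed to decay with tree distance as 1 / (1 + d)."""
    return 1.0 / (1.0 + tree_distance(a, b))


def relation(a, b):
    """Internal factor: assumed to be the Jaccard overlap of properties."""
    union = PROPERTIES[a] | PROPERTIES[b]
    if not union:
        return 0.0
    return len(PROPERTIES[a] & PROPERTIES[b]) / len(union)


def correlation(a, b, alpha=0.5):
    """Weighted mix of Relation and Similarity; alpha is an assumed weight."""
    return alpha * relation(a, b) + (1.0 - alpha) * similarity(a, b)


def expand_query(concept, threshold=0.4, alpha=0.5):
    """Return (concept, score) pairs at or above the threshold, best first."""
    scored = [
        (other, correlation(concept, other, alpha))
        for other in PARENT
        if other != concept
    ]
    kept = [(c, s) for c, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in expand_query("search engine"):
        print(f"{name}: {score:.3f}")
```

In this sketch, lowering `threshold` widens the expansion and pulls in more distant concepts, which mirrors the general trade-off between recall and precision that query extension has to manage.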
Keywords/Search Tags: semantic web, ontology, topic information retrieval, semantic similarity, query extension