
Domain Ontology And Position Relationship Based Information Retrieval Model

Posted on: 2015-03-04 | Degree: Master | Type: Thesis
Country: China | Candidate: S P Sui | Full Text: PDF
GTID: 2298330452453542 | Subject: Computer Science and Technology
Abstract/Summary:
The rapid growth of internet information has greatly changed the way people access information. Faced with so much information online, how to obtain useful information quickly and easily has gradually become the central problem. The emergence of search engines has greatly eased this tension. A search engine is a software system applied on the internet that uses certain strategies to collect and discover information, and builds a retrieval library after analyzing, extracting, and organizing that information. Information retrieval is the mathematical basis of search engines: it organizes and stores information in a certain way and finds relevant information according to users' needs.

An ontology is an explicit, formal specification of a shared conceptual model. It is a special type of term set with structured characteristics, which makes it more suitable for use in computer systems. A domain ontology models the objects in a particular field, or a part of the real world. Among the various information retrieval models, the most commonly used is the vector space model. However, the vector space model has inherent shortcomings, so many researchers have proposed improvements. Although the improved retrieval models have achieved some results, the gains are still not very clear. When calculating the relevance between a query and a document, they consider the domain ontology or the WordNet dictionary, but do not combine the two. In addition, existing retrieval models do not consider the positional features of query terms, which are also an important factor, so the query loses its word-order and adjacency relationships after being processed by the retrieval model.

The innovations of this research are as follows:

First, we collect various concepts in the field of software, as well as the relationships between those concepts, and express them in terminology that serves as the basic concepts of the ontology.
Following the structure of the semantic dictionary, we organize them into a semantic network. We use Protégé to manually build a software domain ontology, which serves as a reference for concept similarity.

Second, we propose a new similarity algorithm that merges information-theory-based similarity with software-domain-ontology-based similarity.

Then, we consider the positional relationship of query terms as a factor in calculating relevance. We propose the concepts of word-order-based relevance and word-adjacency-based relevance, formalize them, and make a preliminary implementation.

Finally, we construct and implement an information retrieval system based on the software domain ontology and the proposed concepts. The results show that this model achieves better precision than the VSM model, and that the proposed similarity calculation is relatively close to empirical values.
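
The abstract does not give the formulas for word-order-based or word-adjacency-based relevance, so the following Python sketch is only an illustration of the general idea, not the thesis's method. The function names, the pairwise-order and bigram definitions, and the weights `w_vsm`, `w_order`, and `w_adj` are all assumptions made for this example. It shows how two documents that a plain term-frequency vector space model scores identically can be separated once term order and adjacency are taken into account.

```python
import math
from collections import Counter


def cosine_similarity(query_terms, doc_terms):
    """Plain VSM relevance: cosine of raw term-frequency vectors."""
    q, d = Counter(query_terms), Counter(doc_terms)
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0


def order_score(query_terms, doc_terms):
    """Assumed definition: fraction of query-term pairs whose first
    occurrences in the document appear in the same order as in the query."""
    first_pos = {}
    for i, term in enumerate(doc_terms):
        first_pos.setdefault(term, i)
    # Distinct query terms that occur in the document, in query order.
    present = [t for t in dict.fromkeys(query_terms) if t in first_pos]
    pairs = [(a, b) for i, a in enumerate(present) for b in present[i + 1:]]
    if not pairs:
        return 0.0
    kept = sum(1 for a, b in pairs if first_pos[a] < first_pos[b])
    return kept / len(pairs)


def adjacency_score(query_terms, doc_terms):
    """Assumed definition: fraction of adjacent query-term pairs that are
    also adjacent, in the same order, somewhere in the document."""
    query_bigrams = list(zip(query_terms, query_terms[1:]))
    if not query_bigrams:
        return 0.0
    doc_bigrams = set(zip(doc_terms, doc_terms[1:]))
    hits = sum(1 for bigram in query_bigrams if bigram in doc_bigrams)
    return hits / len(query_bigrams)


def position_aware_relevance(query_terms, doc_terms,
                             w_vsm=0.6, w_order=0.2, w_adj=0.2):
    """Weighted mix of the three scores; the weights are illustrative only."""
    return (w_vsm * cosine_similarity(query_terms, doc_terms)
            + w_order * order_score(query_terms, doc_terms)
            + w_adj * adjacency_score(query_terms, doc_terms))


if __name__ == "__main__":
    query = ["domain", "ontology", "retrieval"]
    doc_a = ["domain", "ontology", "based", "retrieval", "model"]
    doc_b = ["retrieval", "model", "ontology", "of", "domain"]
    for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
        print(name,
              round(cosine_similarity(query, doc), 3),
              round(position_aware_relevance(query, doc), 3))
```

In this toy example both documents contain the same three query terms among five distinct terms, so their cosine scores are equal. Only `doc_a` preserves the query's term order and one of its adjacent pairs, so the combined score ranks it higher.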
Keywords/Search Tags:Information Retrieval, Domain Ontology, Similarity, Search Engine, Vector Space Model