
The Semantic Information Retrieval Research Based On Multilayer Vector Space Model

Posted on: 2012-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: B Bai
GTID: 2218330338497541
Subject: Computer system architecture
Abstract/Summary:
Information plays an increasingly important role in modern society. Reliable and accurate information can improve efficiency, reduce working hours, strengthen competitiveness, and support scientific decision-making. As an important means of obtaining the information users need, information retrieval is attracting growing attention. However, traditional information retrieval models are inefficient and have many defects, such as false retrieval and omitted retrieval (irrelevant documents returned and relevant documents missed), so the search results often fail to match the user's intent. To address these defects, this paper proposes new research and improvements in information retrieval.
First, the principles of information retrieval are introduced. The traditional Boolean model, vector space model, and probabilistic model are studied and compared. Among these, the vector space model is the most comprehensive and the most widely used traditional retrieval model, and it is the model improved in this paper. In the traditional model, the keywords in documents and query statements are regarded as independent, with no semantic relations between them. At the same time, every segment of a document makes its own contribution to the document's theme, so keywords in different parts of a document must be treated differently.
Retrieval is semantically extended by using a domain ontology. An ontology contains a hierarchical structure of concepts and supports inference over conceptual relationships. This helps the computer interpret what the user means and makes communication between people and computers more convenient. This paper introduces the concept of ontology and its related theory. After analyzing the role of ontology in information retrieval, information retrieval based on semantic extension is proposed; it is a form of semantic information retrieval and is based on a domain ontology.
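The semantic extension described above can be illustrated with a minimal sketch. It is not the thesis's algorithm: the toy ontology, the concept names, the `expand_query` function, and the `related_weight` value of 0.5 are all assumptions made for illustration. The sketch expands each query keyword with its synonyms and with its direct parent and child concepts, giving the hierarchically related concepts a lower weight.

```python
# Toy domain ontology (hypothetical): each concept records its is-a parent
# and a list of synonyms. None of these names come from the thesis.
ONTOLOGY = {
    "laptop": {"parent": "computer", "synonyms": ["notebook"]},
    "desktop": {"parent": "computer", "synonyms": []},
    "computer": {"parent": "device", "synonyms": ["pc"]},
    "device": {"parent": None, "synonyms": []},
}


def expand_query(terms, ontology, related_weight=0.5):
    """Return {term: weight} for the expanded query.

    Original terms and their synonyms get weight 1.0; direct parent and
    child concepts get `related_weight` (an assumed value).
    """
    expanded = {}

    def add(term, weight):
        # Keep the highest weight if a term is reached by several routes.
        expanded[term] = max(expanded.get(term, 0.0), weight)

    for term in terms:
        add(term, 1.0)
        entry = ontology.get(term)
        if entry is None:
            continue  # keyword is not a concept in the ontology
        for synonym in entry["synonyms"]:
            add(synonym, 1.0)
        if entry["parent"] is not None:
            add(entry["parent"], related_weight)
        for name, other in ontology.items():
            if other["parent"] == term:
                add(name, related_weight)
    return expanded
```

For example, `expand_query(["laptop"], ONTOLOGY)` adds the synonym `notebook` at full weight and the parent concept `computer` at half weight, so a document that mentions only "notebook" can still match the query.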
The disadvantage of the traditional vector space model stems from its simple, mechanical matching at the level of syntax. Information retrieval based on semantic extension raises matching to the semantic level: after semantic reasoning, the semantic information about keywords and the relationships between them are obtained, which makes up for the semantic shortcomings of traditional vector space retrieval. For document structure, the idea of a multilayer VSM is used. The document is divided into segments according to the differing importance of its parts, and the similarity is adjusted accordingly. The model can thus reflect the characteristics and attributes of the document and capture the relationship between the document and the retrieval statement more accurately.
Information retrieval based on the multilayer vector space model is then combined with semantic information retrieval. Because of the many deficiencies of the traditional VSM, this paper proposes a new semantic information retrieval model based on the multilayer vector space model, together with its algorithm, which integrates semantic information retrieval based on a domain ontology with the multilayer vector space model. Related theories and methods are also put forward.
In addition, semantic information retrieval based on the multilayer VSM uses the tf-idf formula as the keyword-value formula, which makes the calculation of keyword values more comprehensive. In practical retrieval, the new model is no longer confined to a fixed format; instead, a suitable method is used for each segment. The experiments demonstrate that the new model and method are effective and feasible for retrieval.
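The multilayer idea can be sketched as follows. The abstract does not give the thesis's exact formulas, so everything here is an assumption: the segment names (`title`, `body`), the segment weights, the smoothed idf variant `log(N / df) + 1`, and all function names. The sketch builds one tf-idf vector per document segment and combines the per-segment cosine similarities as a weighted sum.

```python
import math
from collections import Counter


def build_idf(docs):
    """docs: list of {segment_name: [tokens]}. Returns {term: idf}.

    Uses idf = log(N / df) + 1, one common variant (an assumption here).
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        seen = set()
        for tokens in doc.values():
            seen.update(tokens)
        df.update(seen)  # count each term once per document
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def tf_idf_vector(tokens, idf):
    """Sparse tf-idf vector {term: weight} for one token list."""
    if not tokens:
        return {}
    counts = Counter(tokens)
    return {t: (c / len(tokens)) * idf.get(t, 0.0) for t, c in counts.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def multilayer_similarity(query_tokens, doc, idf, segment_weights):
    """Weighted sum of per-segment cosine similarities to the query."""
    query_vec = tf_idf_vector(query_tokens, idf)
    return sum(
        weight * cosine(query_vec, tf_idf_vector(doc.get(segment, []), idf))
        for segment, weight in segment_weights.items()
    )


# Hypothetical two-document corpus and segment weights.
docs = [
    {"title": ["ontology", "retrieval"], "body": ["vector", "space"]},
    {"title": ["vector", "space"], "body": ["ontology", "retrieval"]},
]
weights = {"title": 0.7, "body": 0.3}
idf = build_idf(docs)
scores = [
    multilayer_similarity(["ontology", "retrieval"], d, idf, weights)
    for d in docs
]
```

With these assumed weights, the first document ranks higher because the query terms occur in its title, which is the segment given the larger weight. A flat VSM would score the two documents identically, since they contain the same terms overall.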
Keywords/Search Tags: Information retrieval, vector space model, similarity, domain ontology