Research On Semantic Processing Technology Based Information Retrieval Model

Posted on: 2010-10-15  Degree: Doctor  Type: Dissertation
Country: China  Candidate: R Q Wang  Full Text: PDF
GTID: 1118360302958558  Subject: Computer Science and Technology
Abstract/Summary:
We live in an information age characterized by information explosion. Information retrieval techniques are challenged by increasingly frequent updates to Internet content and by growing user demand for more precise search results. Semantic search is a promising approach to finding exact information effectively within massive collections. However, because semantic web technology has not yet been fully realized, recent work has focused on semantic retrieval techniques for the transition period, making this a hot research topic.

This dissertation addresses several key problems in the Information Retrieval (IR) domain and proposes a novel Semantic Processing Technology based Information Retrieval (SPTIR) model. SPTIR extends Query Expansion (QE) and search result re-ranking, and consists of four parts: semantic query expansion based on Word Sense Disambiguation (WSD), query optimization based on word semantic relatedness, search result re-ranking based on document semantic relevance, and semantically enhanced personalized information recommendation.

Firstly, in the context of keyword-based search engines, a well-structured and meaningful user query not only expresses the user's needs precisely but also helps guarantee the Quality of Service (QoS) requirement for information retrieval. Starting from the semantic associations among query keywords, supplemented by implicit feedback and using unsupervised Word Sense Disambiguation, this dissertation presents a technique that maps query keywords to ontology concepts, together with a semantic query expansion technique based on concept-word association.
WSD-based semantic query expansion addresses the problem that traditional retrieval systems understand the user's query intention poorly.

Secondly, for query keywords that cannot be disambiguated, this dissertation presents a strategy that selects candidate expansion keywords directly from relevant documents using implicit feedback. To further condense and optimize the expansion keywords generated from feedback, and to avoid the "topic shift" phenomenon in query expansion, a semantic relatedness measure between terms is used to filter the expanded keywords and thereby optimize the query.

Thirdly, traditional keyword-based search often returns millions of results, so evaluating the relevance of retrieval results has become a hot research topic. Depending on whether query disambiguation succeeds or fails, two distinct document semantic relevance measures are proposed: Semantic Vector Space Model based Document Relevance and Word Vector Space Model based Document Relevance. Using semantic relevance, the search results are re-ranked so that documents with a strong semantic correlation to the query words are presented to the user first.

Fourthly, the problem of meeting the information needs of different users is studied, and a semantically enhanced personalized information recommendation model is proposed. This model uses semantic data sources together with historical rating data to implement a hybrid recommendation. Introducing semantic data sources addresses the data sparsity and cold-start problems of traditional collaborative filtering systems. In addition, to improve system scalability and enable real-time recommendation, fuzzy clustering is used to cluster users and items in the offline data pre-processing stage.
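
As a rough illustration of the second and third parts (filtering expansion keywords by semantic relatedness, then re-ranking results in a word vector space), a minimal Python sketch follows. It is not the dissertation's implementation: the abstract does not specify the relatedness measure, the threshold, or the term weighting, so the `relatedness` callback, the `threshold` default of 0.5, the mean-relatedness rule, and the raw term-frequency vectors are all assumptions made for the example, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dict-like)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_expansion_terms(query_terms, candidates, relatedness, threshold=0.5):
    """Keep candidates whose mean relatedness to the query terms meets the threshold.

    `relatedness(q, c)` is a caller-supplied function returning a score in [0, 1];
    the actual measure used in the dissertation is not given in the abstract.
    """
    if not query_terms:
        return list(candidates)
    kept = []
    for cand in candidates:
        mean_score = sum(relatedness(q, cand) for q in query_terms) / len(query_terms)
        if mean_score >= threshold:
            kept.append(cand)
    return kept


def rerank(query_terms, documents):
    """Order documents by cosine similarity of raw term-frequency vectors (assumed weighting)."""
    q_vec = Counter(query_terms)
    scored = [(cosine(q_vec, Counter(doc.split())), doc) for doc in documents]
    # sorted() is stable, so documents with equal scores keep their input order.
    return [doc for _, doc in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

In this sketch, candidates that drift away from the query topic receive a low mean relatedness and are dropped before the expanded query is issued, which is one simple way to limit "topic shift". The re-ranking step would be applied to the results returned for the optimized query.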
Keywords/Search Tags:Information Retrieval, Semantic Association, Implicit feedback, Word Sense Disambiguation, Query Expansion, Semantic Relatedness, Query Optimization, Clustering, Personalized Recommendation