Study On Key Techniques Of Contextual Information Retrieval

Posted on: 2008-01-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Tian
Full Text: PDF
GTID: 1118360242491185
Subject: Computer application technology
Abstract/Summary:
The explosive growth of information has made it difficult for the average user to satisfy his or her information needs. Information retrieval (IR) systems are intended to relieve users of this difficulty. As IR techniques have developed, contextual information retrieval (CIR) has been identified as a promising direction for improving search. In CIR, retrieval depends on the time and place at which the query is submitted, the history of interaction, the task at hand, and many other factors that are not given explicitly but lie implicitly in the interaction and surroundings of the search, namely the context. The aim of CIR is to improve the efficiency with which the information need behind the user's query is satisfied, that is, to provide the user with WYGIWYW (what you get is what you want). Furthermore, CIR is viewed as combining search technologies and knowledge about user context into a single framework to realize context-driven information supply in the future.

In IR, users and the retrieved objects (documents in this paper) are the core elements. In CIR, how to semantically understand and utilize the contexts of users and of documents are two of the most fundamental problems. In this paper, we focus mainly on two problems: one is the semantic smoothing of the document model within the framework of statistical language models; the other is the semantic understanding of the user's information need behind the query. Finally, we present a prototype system called CKRS (contextual knowledge retrieval system). The main contributions of this thesis are as follows.

(1) Propose a Novel Method to Measure Term-Concept Association

In semantic-based query expansion, measuring term-concept association is a key step in finding associated concepts to describe query needs. We propose a novel method called K2CM (Keyword to Concept Method) to measure term-concept association.
In K2CM, we combine the attaching relationships among terms, documents and concepts with term-concept co-occurrence relationships to measure term-concept association. The attaching relationship derives from the fact that, in an annotated corpus, a term is attached to certain concepts: the term occurs in some documents, and those documents are labeled with some concepts. We enhance the term-concept co-occurrence relationship with the text distance and the distribution features of the term-concept pair in the corpus. Experimental results of semantic-based search on three different corpora show that, compared with classical methods, semantic-based query expansion on the basis of K2CM can improve search effectiveness.

(2) Propose a Term-Concept Association Based Language Model

Nowadays, more and more documents are labeled with ontology concepts. Based on this fact, we propose TCA-LM (Term-Concept Association Based Language Model) for information retrieval. The main idea of TCA-LM is to view a document as two parts: a document block with semantics, which we call the semantic block, and a document block without semantics, which we call the non-semantic block. For a document, we compute the query likelihood of each of its two parts and sum the two as the query likelihood of the document. For the non-semantic block, we assume that its terms are all independent and compute its query likelihood with the classical language model. For the semantic block, since it is composed of ontology concepts, we compute its query likelihood with a language model smoothed by term-concept association. Experimental results on a public dataset show that TCA-LM can benefit search effectiveness.

(3) Propose a Novel Method to Measure Semantic Association between Two Concepts in Domain Ontology

In a domain ontology, semantic association (SA) is used to depict the correlation between two concepts.
However, there is usually no weight assigned to the link between two concepts in the ontology, which has been considered one of the main obstacles to using ontologies. In this paper, we define DSA (Degree of Semantic Association) to describe the semantic association from a concept to its directly related concept in the domain ontology. DSA comes from the intuition that each type of semantic relationship implies semantic association to a certain extent. A method called SRbM (Semantic Relationship based Method) is given to compute DSA based on semantic relationships. DSA is developed from a probability P_y(x|r), where x and y are two concepts in the ontology and r is the semantic relationship from y to x. MDSA is also developed to describe the degree of semantic association between any two concepts in the ontology. Experimental results of semantic retrieval on open data show that semantic query expansion based on MDSA performs better than classic semantic query expansion.

(4) Model User's Cognitive Structure in Contextual Information Retrieval

In contextual information retrieval, retrieval depends on the time and place at which the query is submitted, the history of interaction, the task at hand, and many other factors that are not given explicitly but lie implicitly in the interaction and surroundings of the search, namely the context. The user's cognition is one of the important contextual factors that help in understanding his or her personal needs. We propose a model called DbSAM to obtain the user's individual cognitive structure of domain knowledge. DbSAM is inspired by the spreading-activation model of psychology and is built on the domain ontology; its goal is to obtain the user's cognitive structure.
Cost analysis of the construction algorithm shows that it is feasible to obtain the cognitive structure with DbSAM (Domain based Spreading-Activation Model), and personalized search experiments on a digital library indicate that applying the cognitive structure can improve search effectiveness and user satisfaction.
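
The abstract does not give the details of DbSAM, so the following is only a minimal sketch of generic spreading activation over a weighted concept graph, the psychological model that DbSAM is said to be inspired by. The function name `spread_activation`, the parameters `decay`, `threshold` and `max_steps`, and the toy ontology with its edge weights are all hypothetical and chosen for illustration; they are not taken from the thesis.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, threshold=0.05, max_steps=3):
    """Generic spreading activation (illustrative, not the thesis's DbSAM).

    graph: {concept: {neighbor: edge weight in [0, 1]}}
    seeds: {concept: initial activation}, e.g. concepts the user queried
    Returns a dict mapping each reached concept to its accumulated activation.
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                # Energy passed along an edge is attenuated by the edge
                # weight and a global decay factor.
                passed = energy * weight * decay
                if passed >= threshold:
                    next_frontier[neighbor] += passed
        if not next_frontier:
            break  # nothing left strong enough to propagate
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical toy ontology; the weights are made-up association strengths.
toy_ontology = {
    "information retrieval": {"language model": 0.8, "query expansion": 0.6},
    "language model": {"smoothing": 0.5},
    "query expansion": {"ontology": 0.4},
}

profile = spread_activation(toy_ontology, {"information retrieval": 1.0})
for concept, score in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{concept}: {score:.2f}")
```

In this sketch the resulting activation map plays the role of a simple per-user concept profile: concepts closer to what the user has interacted with receive higher scores. Whether DbSAM weights edges, decays activation, or thresholds it in this way is an assumption.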
Keywords/Search Tags:Contextual Information Retrieval, Statistical Language Model, Domain Ontology, Concept, Semantic Association, User Context, User Cognitive Structure