The Research Of Machine Learning Techniques And External Web Resources For Relevance Feedback

Posted on:2012-10-08

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Z Ye

Full Text:PDF

GTID:1118330335454687

Subject:Computer software and theory

Abstract/Summary:

With the explosive growth of information on the Internet, there is an increasing need for information systems that help users find the resources they need. Information retrieval systems are designed to meet this challenge of information overload. Their main application, the search engine, has achieved great success over the past decade. Extensive experiments have shown that relevance feedback is one of the most effective techniques for ad hoc information retrieval. In this dissertation, we explore the use of machine learning and Web mining techniques to further enhance relevance feedback methods. In particular, the main work of this dissertation can be summarized as follows:

(1) In most current relevance feedback models, the expansion terms are selected based on document-level statistics. However, a given feedback document, even one judged relevant by a human assessor, may cover several different topics, and not all of these topics are useful for relevance feedback. We argue that it is more reasonable to conduct relevance feedback at a finer-grained level. Following this argument, a novel topic-based relevance feedback model is proposed in this dissertation, in which three different methods for identifying the query-related topic are explored.

(2) In traditional relevance feedback models, each feedback document is treated equally. In fact, the feedback documents differ in quality and therefore influence the relevance feedback process differently. To address this problem, we revisit Rocchio's algorithm and propose integrating this classical feedback method into the divergence from randomness (DFR) probabilistic framework for pseudo relevance feedback (PRF). This integration is denoted RocDFR in this dissertation. In addition, we further improve the robustness of RocDFR by proposing two quality-biased feedback methods, called QRocDFR and ReRocDFR.

(3) Most existing relevance feedback approaches assume that the most informative terms in the top-ranked documents from the first-pass retrieval can be viewed as the context of the query, and thus can be used to specify the information need. However, irrelevant documents may be used in PRF (especially for hard topics), which can bring noise into the feedback process. The recent development of Web 2.0 technologies on the Internet provides an opportunity to enhance PRF, as more and more high-quality resources can be obtained freely.

(4) Most current PRF approaches estimate the importance of candidate expansion terms from their document-level statistics. However, traditional query expansion models generally ignore context information. As a result, off-topic terms can also be selected, which may decrease retrieval performance. In this dissertation, we propose a context-based feedback framework based on a Bayesian network, in which multiple types of context information can be taken into account.
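
As a rough illustration of the pseudo relevance feedback idea mentioned in item (2), the following is a minimal Python sketch of classical Rocchio-style query expansion. It is not the dissertation's RocDFR, QRocDFR or ReRocDFR: a simple TF-IDF weight stands in for the DFR term weights, and the function names, the parameters (`alpha`, `beta`, `num_terms`, `doc_weights`) and the toy data are all assumptions made for illustration. The optional `doc_weights` argument only hints at how feedback documents could be treated unequally; how document quality is actually estimated in the dissertation is not stated in the abstract.

```python
import math
from collections import Counter, defaultdict


def tfidf_vector(tokens, doc_freq, num_docs):
    """Weight each term by tf * idf (an assumed stand-in for DFR weights)."""
    counts = Counter(tokens)
    return {
        term: tf * math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))
        for term, tf in counts.items()
    }


def rocchio_expand(query_tokens, feedback_docs, doc_freq, num_docs,
                   alpha=1.0, beta=0.75, num_terms=5, doc_weights=None):
    """Return alpha * query + beta * (weighted centroid of feedback docs).

    feedback_docs: token lists of the top-ranked documents from a
                   first-pass retrieval (assumed to be given).
    doc_weights:   hypothetical per-document weights; uniform if omitted.
    """
    if doc_weights is None:
        doc_weights = [1.0] * len(feedback_docs)
    total = sum(doc_weights)
    if total <= 0:
        raise ValueError("need at least one feedback document with positive weight")

    # Weighted centroid of the feedback document vectors.
    centroid = defaultdict(float)
    for tokens, doc_weight in zip(feedback_docs, doc_weights):
        for term, weight in tfidf_vector(tokens, doc_freq, num_docs).items():
            centroid[term] += (doc_weight / total) * weight

    # Keep the highest-weighted candidate expansion terms (ties broken by term).
    candidates = [(t, w) for t, w in centroid.items() if w > 0]
    candidates.sort(key=lambda item: (-item[1], item[0]))

    # Original query terms are weighted by alpha, expansion terms by beta.
    expanded = defaultdict(float)
    for term, tf in Counter(query_tokens).items():
        expanded[term] += alpha * tf
    for term, weight in candidates[:num_terms]:
        expanded[term] += beta * weight
    return dict(expanded)


if __name__ == "__main__":
    # Toy data: two "top-ranked" documents for an ambiguous query.
    docs = [["jaguar", "car", "engine"], ["jaguar", "cat", "jungle"]]
    df = {"jaguar": 2, "car": 3, "engine": 2, "cat": 3, "jungle": 1}
    print(rocchio_expand(["jaguar"], docs, df, num_docs=10))
    # Down-weighting the second document removes its terms from the expansion.
    print(rocchio_expand(["jaguar"], docs, df, num_docs=10, doc_weights=[1.0, 0.0]))
```

With uniform weights, terms from both feedback documents enter the expanded query. Setting a document's weight to zero excludes its terms, which is the intuition behind biasing feedback toward higher-quality documents.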

Keywords/Search Tags:

Text Information Retrieval, Retrieval Model, Relevance Feedback, Machine Learning

Related items

1	Research On Information Retrieval Technology
2	Research On Relevance Feedback And Long-term Learning In Content Based 3D Model Retrieval
3	Using contextual information and machine learning technique to improve retrieval performance
4	Research On The Relevance Feedback Based On Log Learning For Image Retrieval
5	Studies On Algorithms In Chinese Information Retrieval
6	Image Retrieval Based On Relevance Feedback
7	Based Relevance Feedback Image Retrieval Techniques And Realization
8	Research On Content-Based Music Information Retrieval With Relevance Feedback
9	Neural Network-based Image Retrieval Relevance Feedback Mechanism
10	Image Retrieval Based On Deep Learning And Relevance Feedback