
The Research On Query Understanding And Positive-Negative Relevance Feedback Approaches

Posted on: 2017-10-21 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Y L Ma | Full Text: PDF
GTID: 1318330488993442 | Subject: Computer software and theory
Abstract/Summary:
One of the Internet's key strengths is the broad and rapid sharing and exchange of information. However, as the volume of data and information on the Internet grows exponentially, people face an ever-increasing problem of information overload. Information retrieval techniques emerged and advanced alongside the rapid development of the Internet, and they have become the most direct and effective approach to addressing information overload. In recent years, query understanding and relevance feedback have been widely verified as effective ways to improve the performance of information retrieval. Despite significant progress, many key problems are still not handled well. Targeting these limitations, the thesis first studies and analyzes classic and up-to-date approaches to query understanding and relevance feedback, and then builds on them with an in-depth study of query understanding and relevance feedback techniques. The main work and contributions of the thesis are as follows:

1. To solve the problem of term weight prediction in query understanding, we cast it as a sequence tagging problem and propose a novel model for learning to weight query terms based on recurrent neural networks. Taking into account the statistical, grammatical, and semantic information of the query terms and the relationships among them, the method combines a genetic algorithm with real relevance judgements of documents to obtain the optimal term weights. It then uses bidirectional recurrent neural networks, trained by supervised learning, to model the relationship between a sequence of query terms and its corresponding optimal weight sequence, and predicts the weights of those terms effectively, reasonably, and automatically. The experimental results demonstrate that the weights obtained by the proposed method improve retrieval performance significantly, and that the method is superior to the baseline methods in precision on multiple standard datasets.

2. Existing query intent classification methods generally depend on manually annotated data and are not flexible enough to accommodate taxonomy changes. To address this, the thesis turns the query intent classification task into a two-stage problem consisting of sequence classification followed by classic classification, and proposes a novel query intent classification method based on hybrid deep learning. First, to increase flexibility and efficiency, the method builds a hybrid deep neural network that serves as a two-stage classifier for query intent. Then, to reduce the dependence on manual labelling, it mines the labelling behavior of real users using implicit relevance feedback and builds the training data for classification automatically. The experimental results demonstrate that the proposed method can classify queries by intent, and that its classification efficiency is much better than that of the baseline methods.

3. Targeting the weaknesses of existing query expansion methods based on pseudo relevance feedback, the thesis uses search engine query logs as feedback information for expanding queries and proposes a novel implicit feedback method based on a two-stage SimRank algorithm and query expansion. The method first introduces edge weights to revise the original SimRank, which is a graph-based algorithm. Then, the similarities and semantic correlations of terms are discovered on a term-relationship graph, which is derived by transformation from the query-click graph, so as to select high-quality expansion terms. As the experimental results on several public standard datasets indicate, the proposed method selects appropriate expansion terms and makes information retrieval more effective.

4. Existing relevance feedback methods do not consider positive and negative relevance information simultaneously. Therefore, using implicit feedback together with both positive and negative relevance information, the thesis proposes a novel multiple feedback framework based on the language model. By analyzing the bidirectional relevant documents of implicit feedback in difficult-query scenarios, the method builds both a positive and a negative relevance model and applies the positive model to enhance the negative one. It can simultaneously promote relevant documents and filter out irrelevant ones as far as possible, thereby improving retrieval performance. Experimental results on several TREC collections show that the proposed relevance feedback method is generally more effective than both the baseline method and the methods using only the positive or the negative model, and that retrieval performance is significantly improved.

Together, the four aspects of this study yield a solution that applies query understanding and relevance feedback techniques to strengthen the whole information retrieval process, helping to boost retrieval performance and improve user experience.
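
The idea behind contribution 4, promoting documents that resemble positive feedback while demoting those that resemble negative feedback, can be illustrated with a small sketch. This is not the thesis's model: it is a generic language-modeling formulation in which the query model is interpolated with a positive feedback model and documents are penalized for fitting a negative feedback model. The function names, the Dirichlet smoothing, and the parameters `alpha`, `beta`, and `mu` are all assumptions made for illustration.

```python
import math
from collections import Counter


def unigram_model(docs):
    """Maximum-likelihood unigram model over a list of tokenized documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def smoothed_prob(term, doc_counts, doc_len, collection, mu):
    """Dirichlet-smoothed P(term | document); unseen collection terms get a tiny floor."""
    return (doc_counts.get(term, 0) + mu * collection.get(term, 1e-9)) / (doc_len + mu)


def expected_log_prob(model, doc, collection, mu=10.0):
    """Sum of model[t] * log P(t | doc): higher means the document fits the model better."""
    doc_counts = Counter(doc)
    return sum(
        p * math.log(smoothed_prob(t, doc_counts, len(doc), collection, mu))
        for t, p in model.items()
    )


def score(query, doc, pos_docs, neg_docs, collection, alpha=0.5, beta=0.3):
    """Hypothetical positive-negative feedback score (illustrative, not the thesis's)."""
    query_model = unigram_model([query])
    pos_model = unigram_model(pos_docs)
    neg_model = unigram_model(neg_docs)
    # Positive feedback: interpolate the query model with the positive model.
    terms = set(query_model) | set(pos_model)
    expanded = {
        t: (1 - alpha) * query_model.get(t, 0.0) + alpha * pos_model.get(t, 0.0)
        for t in terms
    }
    # Negative feedback: subtract a penalty for fitting the negative model.
    return (expected_log_prob(expanded, doc, collection)
            - beta * expected_log_prob(neg_model, doc, collection))


# Toy usage with an ambiguous query ("jaguar" the animal vs. the car).
query = ["jaguar", "speed"]
pos_docs = [["jaguar", "cat", "speed", "animal"]]   # assumed clicked / relevant
neg_docs = [["jaguar", "car", "engine", "price"]]   # assumed skipped / irrelevant
candidates = {
    "animal_doc": ["jaguar", "cat", "runs", "animal"],
    "car_doc": ["jaguar", "car", "engine", "dealer"],
}
collection = unigram_model(pos_docs + neg_docs + list(candidates.values()))
ranking = sorted(
    candidates,
    key=lambda name: score(query, candidates[name], pos_docs, neg_docs, collection),
    reverse=True,
)
print(ranking)
```

Setting `beta=0` reduces the sketch to positive-only feedback and `alpha=0` to negative-only feedback, which mirrors the single-model variants the abstract compares against.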
Keywords/Search Tags: Information Retrieval, Query Understanding, Relevance Feedback, Machine Learning