
Information Retrieval And Query Recommendation For Information Precise Service

Posted on: 2017-01-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Cai
GTID: 1318330536467204
Subject: Army commanding learn
Abstract/Summary:
As military information becomes easier to obtain, its volume is increasing dramatically, which places a growing burden on today's military information systems. In particular, building, developing and maintaining such systems is challenging. An effective approach to organizing, mining and analyzing the available military information can help optimize these systems, making the information we hold more valuable and improving user satisfaction. The main purpose of this thesis is to optimize current military information systems using available data mining technologies, along two main directions: information retrieval and query recommendation. We address central problems in both fields, which are being studied in both the academic community and industry. In information retrieval, we aim to answer the following questions: (1) How can a user's feedback be incorporated into current information retrieval models to boost performance? (2) Given a query, how can a personalized ranking of documents be generated by analyzing the user's behavior? In query recommendation, we aim to answer the following questions: (1) How can semantic similarity be injected into current query recommendation approaches to improve performance? (2) How can redundancy in the list of query recommendations be reduced to improve user satisfaction?
To answer these questions, we conduct a comprehensive investigation on publicly available datasets. Our main contributions are summarized as follows.
(1) We propose an information retrieval model based on rule mining. The model clusters the patterns in training data consisting of query-document pairs and their user-provided relevance labels, and then uses the discovered rules to rank documents at query time. We evaluate the method systematically on the LETOR benchmark dataset. The experimental results show that it outperforms state-of-the-art methods without notably time-consuming and laborious pre-processing.
(2) We propose an information retrieval model based on logistic regression. The Virtual Feature based Logistic Regression (VFLR) ranking method performs logistic regression on a set of essential but independent variables, called virtual features (VF), which are extracted with kernel principal component analysis (KPCA) using the user's relevance feedback. We then predict a ranking score for each document retrieved for a query to produce a ranked list. We evaluate the method systematically on the LETOR 4.0 benchmark datasets. The experimental results demonstrate that it outperforms state-of-the-art methods in terms of Mean Average Precision (MAP), Precision at position k (P@k), and Normalized Discounted Cumulative Gain at position k (NDCG@k).
(3) We propose a personalized information retrieval model based on user behavior analysis. Here we focus on a different type of signal: behavioral information used for search personalization. That is, we consider clicks and dwell time for re-ranking an initially retrieved list of documents. In particular, we investigate the impact of the distributions of users and queries on document re-ranking; we estimate the relevance of a document to a query at two levels, the query level and the word level, to alleviate the sparseness problem; and we perform an experimental evaluation both for users seen during the training period and for users not seen during training. For the latter, we explore the use of information from similar users who were seen during training. We use the dwell time on clicked documents to estimate a document's relevance to a query, and apply Bayesian probabilistic matrix factorization to generate a relevance distribution of a document over queries. Our experiments show that, for personalized ranking, behavioral information helps to improve retrieval effectiveness, and that, given a query, merging information inferred from the behavior of a particular user with information from the behavior of other users using a user-dependent adaptive weight outperforms any combination with a fixed weight.
(4) We propose a query recommendation model based on semantic similarity and time-aware term popularity. Based on the Markov assumption, we propose a new ranking method for query recommendation, which models the user's engagement with query recommendations as a Markov chain and takes the semantic similarity between terms into account. We compare the model with traditional approaches based on query popularity and verify its effectiveness in terms of Mean Reciprocal Rank (MRR). The experimental results show that our model significantly outperforms the baselines, with an average MRR improvement of around 4%.
(5) We propose a greedy approach for diversifying query recommendations. The greedy query selection (GQS) model tends to return the correct query early in the candidate list while also reducing the redundancy among the candidates. In particular, based on diversification at the query-aspect level, we detect the query intents implicitly expressed by earlier search behavior in the current session, and use them to add queries to the final list so as to maximize the probability that the average user finds at least one acceptable query candidate in the list. We quantify the improvement of our model over the baselines using two large-scale real-world query logs and show that it beats competitive baselines in terms of Mean Reciprocal Rank (MRR) and α-nDCG.
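As an illustration of two of the evaluation metrics named in the abstract, the following is a minimal Python sketch of MRR and NDCG@k. It is not taken from the thesis: the function names are hypothetical, the gain formula 2^g - 1 with a log2 position discount is one common convention that the thesis may or may not use, and the ideal ranking is assumed to be computable from the graded labels of the ranked list itself. The diversity-aware variant of nDCG reported for the query recommendation experiments is not shown.

```python
import math


def reciprocal_rank(ranked, relevant):
    """Return 1/rank of the first relevant item, or 0.0 if none is found."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_list, relevant_set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the top-k graded labels.

    Assumption: gain = 2^g - 1, discount = log2(position + 1).
    """
    return sum(
        (2 ** g - 1) / math.log2(position + 1)
        for position, g in enumerate(gains[:k], start=1)
    )


def ndcg_at_k(gains, k):
    """DCG@k normalized by the DCG@k of the ideally sorted labels.

    Assumption: `gains` holds the labels of all judged documents for
    the query, listed in the order the system ranked them.
    """
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0
```

For example, `reciprocal_rank(["q1", "q2", "q3"], {"q2"})` evaluates to 0.5 because the first acceptable query appears at rank 2, and a list whose graded labels are already in descending order gets an NDCG@k of 1.0.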
Keywords/Search Tags:Optimization of information systems, information retrieval, relevance feedback, personalized ranking, user behavior analysis, query recommendation