Research And Implementation Of User Click Feature Reconstruction Method For Academic Search

Posted on: 2019-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
GTID: 2428330593450279
Subject: Computer Science and Technology
Abstract/Summary:
How to improve the performance of search engines, especially in specialized retrieval domains such as academic search, is a problem with broad application prospects and considerable challenges. It is generally believed that users' feedback while using a search engine, such as their click behavior, is closely related to their search intent. In academic search in particular, users want accurate results and take a more active part in the retrieval process, which produces a great deal of feedback behavior. Click behavior can therefore be used to infer the relevance of a document to a query and so improve search performance. Unfortunately, in practice a large share of user feedback is concentrated on a few queries, and most queries lack feedback: it is often noisy, too sparse, or missing entirely. How to rebuild user feedback so that it is more realistic, more effective, and denser has therefore become a bottleneck in current work. After analyzing users' click features, we identify the following problems to be solved:
1. How can the effectiveness of the data be maintained while the density of click features is increased? If queries and documents are arranged as a matrix in which rows correspond to queries, columns correspond to documents, and each entry holds the click value of a document under a query, then this click matrix is sparse. Existing matrix reconstruction methods offer a new line of attack, so the first question is how to use matrix reconstruction to increase the density of the click matrix. Because the information in the click matrix is very limited, fully exploiting the relationships among queries and among documents, and using them to reconstruct the click matrix, is a challenge.
2. For a high-dimensional click matrix, how can an efficient reconstruction algorithm be built?
Matrix reconstruction requires a large number of matrix operations, so for a large-scale click matrix we must consider how to complete the reconstruction within acceptable time and space. To solve these problems, we propose a framework for reconstructing users' click features for academic retrieval:
1. The low-rank matrix factorization model is widely used in matrix reconstruction. It can address the sparsity of the matrix and preserve the characteristics of the original matrix while increasing its density. In the homophily model, a homophily regularization term constrains the relationship between each query and each document so that similar documents under similar queries receive consistent clicks, which addresses the problems of noise and drift in the data.
2. Combining query grouping and block-based non-negative matrix factorization. General non-negative matrix factorization has limited performance in reconstructing matrices, so we use a special block-based non-negative matrix factorization solved iteratively. Before the iterative solution, we divide the high-dimensional click matrix into small matrices according to the query to improve reconstruction efficiency.
3. Ranking academic search results based on learning to rank. We train learning-to-rank models on the ranking data and compare document retrieval performance after click feature reconstruction across several different ranking models.
In this paper, experiments are conducted on Microsoft's academic search dataset, comparing the ranking performance of an academic search engine using the original click features with that using the reconstructed click features. The results show that the proposed method can effectively rebuild the click features and improve the search performance of academic search engines.
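
To make the idea of a low-rank factorization with a homophily term more concrete, here is a minimal sketch in plain Python. It is not the thesis's algorithm: the abstract does not give the objective or the solver, so the squared-error loss, the query-side similarity penalty, the stochastic gradient updates, and every name below (`reconstruct_clicks`, `query_sim`, `lam`, `beta`) are assumptions chosen only for illustration. Document-side similarity and the block-based non-negative factorization are left out.

```python
import random

def reconstruct_clicks(clicks, n_queries, n_docs, query_sim, rank=2,
                       lam=0.01, beta=0.1, lr=0.05, epochs=2000, seed=0):
    """Fill in a sparse query-by-document click matrix (illustrative only).

    clicks    : dict {(query_index, doc_index): observed click value}
    query_sim : dict {(q1, q2): similarity weight}  (hypothetical input)
    Assumed objective: squared error on observed entries
      + lam * L2 penalty on the factors
      + beta * sum over similar pairs of s * ||U[q1] - U[q2]||^2
    """
    rng = random.Random(seed)
    U = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n_queries)]
    V = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n_docs)]
    for _ in range(epochs):
        # Fit only the observed clicks; missing entries are not treated as zeros.
        for (q, d), c in clicks.items():
            err = c - sum(U[q][k] * V[d][k] for k in range(rank))
            for k in range(rank):
                uq, vd = U[q][k], V[d][k]
                U[q][k] += lr * (err * vd - lam * uq)
                V[d][k] += lr * (err * uq - lam * vd)
        # Homophily step: pull the factors of similar queries toward each other.
        for (q1, q2), s in query_sim.items():
            for k in range(rank):
                diff = U[q1][k] - U[q2][k]
                U[q1][k] -= lr * beta * s * diff
                U[q2][k] += lr * beta * s * diff
    # Dense reconstruction: every (query, document) pair now has a score.
    return [[sum(U[q][k] * V[d][k] for k in range(rank)) for d in range(n_docs)]
            for q in range(n_queries)]

# Toy data: query 1 has no feedback for document 1, but is similar to query 0.
clicks = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 1.0, (2, 0): 0.0, (2, 1): 1.0}
dense = reconstruct_clicks(clicks, n_queries=3, n_docs=2, query_sim={(0, 1): 1.0})
for row in dense:
    print([round(x, 2) for x in row])
```

In this toy setup the missing entry for query 1 and document 1 is filled from the shared factors, and the similarity term nudges it toward what the similar query 0 observed. A real click matrix would need sparse storage and a vectorized solver; the loops here are only meant to be readable.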
Keywords/Search Tags: click feature, low-rank matrix factorization model, homophily model, drift