
Research On Personalized Search Via Knowledge Representation Learning

Posted on: 2022-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: Q Guo
Full Text: PDF
GTID: 2518306563479204
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, the scale of information is growing explosively, and information retrieval has become an important way for people to obtain information from massive data efficiently. Personalized search, an important task within information retrieval, has become a key step in raising retrieval quality and improving the user experience, and it has developed rapidly in industry. As an inevitable trend in the evolution of search engines, it is now an important means for major search engine companies to improve search accuracy and service quality. Personalized search has also received wide attention in academia.

At present, research on personalized search lacks widely recognized public datasets. For reasons of privacy protection, information security and trade secrets, search engines that hold large volumes of query logs cannot disclose much raw data. The query logs that are publicly available commonly lack key information, such as uniform user identifiers, query content or document content. Within personalized search itself, query ambiguity and the recognition of re-finding behaviors have become the key problems hindering progress.

(1) After analyzing the current state of query logs, this thesis selects the large-scale AOL query logs as the data source and proposes a method for constructing a personalized search benchmark dataset. To address the low processing efficiency and lack of space encountered in large-scale data processing, an improved BM25 algorithm is proposed that improves efficiency by more than a factor of six. Based on this method, a personalized search benchmark dataset, AOL4PS, is constructed. Statistical analysis of AOL4PS and a comparison with existing datasets illustrate its applicability and advantages for the personalized search task.

(2) To address query ambiguity, this thesis proposes a personalized representation method based on the fusion of knowledge representation, which models user interest features by fusing word embedding features with structured information. Further, to address the recognition of re-finding behaviors, this thesis proposes personalized search based on the dynamic fusion of personalized representation and query sequence encoding (PRQSE). PRQSE encodes users' historical behaviors with recurrent neural networks, models user interest features at the sequence level, where temporal information is preserved, and captures re-finding behaviors at both the query level and the sequence level through a two-layer attention mechanism.

(3) To verify the effectiveness of the proposed PRQSE model, it is compared with mainstream personalized search methods on AOL4PS. The experimental results show that PRQSE significantly outperforms existing personalized search models on several evaluation metrics. This indicates that PRQSE can effectively fuse word embedding information with structured information from knowledge representation learning, learn personalized representations of queries, encode users' historical behavior, and identify re-finding behaviors, thereby improving the performance of personalized search.
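As background for the dataset construction step in (1), the following is a minimal sketch of the standard Okapi BM25 ranking function, which scores a document by summing, over the query terms, an inverse document frequency weight multiplied by a length-normalized term frequency. It is the textbook formulation only; the abstract does not describe how the thesis's improved variant works, so nothing here should be read as that method. The function name `bm25_scores` and the default values of `k1` and `b` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against a tokenized `query`.

    Textbook Okapi BM25, not the improved variant from the thesis.
    `k1` and `b` are common default values, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: the number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturated by k1 and normalized by document length.
            length_norm = k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + length_norm)
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Toy documents standing in for candidate results; not data from AOL4PS.
    docs = [
        ["cheap", "flights", "paris"],
        ["paris", "hotel"],
        ["weather", "today"],
    ]
    for doc, s in zip(docs, bm25_scores(["paris", "flights"], docs)):
        print(" ".join(doc), round(s, 3))
```

On this toy input the first document, which matches both query terms, ranks above the second, which matches one, and the third receives a score of zero. This version rescans every document for each query; at query-log scale an inverted index would normally be used instead.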
Keywords/Search Tags: Personalized search, Dataset construction, Recurrent neural networks, Attention mechanisms, Knowledge representation learning