Query Optimization Using Topic Modeling And Word Embeddings

Posted on: 2020-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Song
Full Text: PDF
GTID: 2428330575967949
Subject: Computer technology
Abstract/Summary:
The development and application of search engine technology has changed the way people access information, making people increasingly dependent on search engines in their daily lives. In information retrieval, however, because queries are short and the user's intent is often unclear, the documents returned by a retrieval system often do not match what the user is looking for. To improve retrieval effectiveness, search engines generally adopt query optimization techniques, including query expansion and query recommendation. Among traditional query optimization approaches, pseudo-relevance feedback is an effective method, but its topic drift problem often degrades retrieval performance. For query expansion, once expansion words have been obtained from the pseudo-relevance feedback documents, they are often simply appended to the original query. This approach does not measure the correlation between query words and expansion words, and this affects the ranking of the returned documents. For query recommendation, since searches tend to be increasingly specialized, how to extract terminology from the pseudo-relevance feedback documents for recommendation, and how to obtain the semantic relationship between query words and recommended words, become important research issues. This thesis therefore carries out research in the following three areas.

1. Proposing a topic inference strategy to solve the topic drift problem in pseudo-relevance feedback. First, a scoring strategy based on a language model is used to obtain the feedback documents, which are then modeled with the LDA topic model. Next, methods based on Gibbs sampling and on word embeddings are used to infer the topic of the query, and the topic-model-based method for obtaining candidate words is improved accordingly. Experiments show that the word-embedding-based method describes the query from more aspects at the semantic level and reflects more semantic information.

2. Using a weight calculation method to optimize the document scoring strategy in query expansion. First, the topic inference strategies are applied to the selection of expansion words. The features of the candidate expansion words are then calculated from the semantic features and statistical features obtained by word embedding, and each word is assigned a weight according to its feature values. Finally, the results of a second retrieval pass are returned. Experiments show that assigning different weights to expansion words further improves retrieval effectiveness.

3. Proposing a terminology recommendation method to further enhance the user experience. First, a terminology dictionary is used to extract terminology documents from the pseudo-relevance feedback documents, and candidate terms for recommendation are obtained through the topic inference strategies. A relationship recognition algorithm is then established that fuses supervised and unsupervised methods to mine the semantic relations between query words and recommended words. Finally, the words that have a semantic relation to the query are recommended to users. The experimental results show that the proposed method better meets users' search needs.
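
The weighting idea in the second contribution can be illustrated with a minimal Python sketch. It is not the thesis's actual formula, which the abstract does not give. The sketch assumes that the semantic feature is the cosine similarity between a candidate word's embedding and the mean embedding of the query words, that the statistical feature is the fraction of feedback documents containing the candidate, and that the two are mixed linearly. The function name `weight_expansion_terms`, the mixing parameter `alpha`, and the toy embeddings and documents are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def weight_expansion_terms(query_terms, candidates, embeddings,
                           feedback_docs, alpha=0.5, top_k=3):
    """Return {term: weight} for the top_k candidates, weights summing to 1.

    Assumed scoring (illustrative only):
        score = alpha * semantic + (1 - alpha) * statistical
    """
    query_vectors = [embeddings[t] for t in query_terms if t in embeddings]
    if not query_vectors or not feedback_docs:
        return {}
    query_vec = mean_vector(query_vectors)

    scores = {}
    for term in candidates:
        if term in query_terms or term not in embeddings:
            continue
        # Semantic feature: similarity to the query centroid, clipped at 0.
        semantic = max(0.0, cosine(query_vec, embeddings[term]))
        # Statistical feature: share of feedback documents containing the term.
        statistical = sum(term in doc for doc in feedback_docs) / len(feedback_docs)
        scores[term] = alpha * semantic + (1 - alpha) * statistical

    top = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]
    total = sum(score for _, score in top)
    return {term: score / total for term, score in top} if total > 0 else {}


# Toy data, invented purely for illustration.
embeddings = {
    "jaguar": [0.9, 0.1, 0.0],
    "car": [0.8, 0.2, 0.1],
    "engine": [0.7, 0.3, 0.0],
    "cat": [0.1, 0.9, 0.2],
    "banana": [0.0, 0.1, 0.9],
}
feedback_docs = [
    {"jaguar", "car", "engine"},
    {"jaguar", "engine", "speed"},
    {"jaguar", "cat"},
]
weights = weight_expansion_terms(
    ["jaguar", "car"], ["engine", "cat", "banana"], embeddings, feedback_docs
)
for term, weight in weights.items():
    print(f"{term}: {weight:.3f}")
```

In a second retrieval pass, such weights could scale each expansion word's contribution to the document score instead of treating appended words as equal to the original query words, which is the shortcoming the abstract attributes to simple appending.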
Keywords/Search Tags: Query expansion, query recommendation, topic modeling, word embeddings