
Query Optimization Based On Word Embeddings Model

Posted on: 2022-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: X Fang
Full Text: PDF
GTID: 2518306494471444
Subject: Software engineering
Abstract/Summary:
With the advent of the Internet era, search engines have come into wide use. In information retrieval, a search engine often fails to return the required data for unpopular queries because the user's search terms cover too narrow a range, which degrades the user experience. A query optimization system can help the search engine provide reliable service in such cases. Common query optimization methods include query expansion and query recommendation. Query expansion extends the user's original query, and all of the expanded terms are fed into the information retrieval system, which improves the system's recall. Query recommendation draws relevance associations from the original query entered by the user and recommends candidate queries that the user may be looking for. Both methods need semantic information that is highly relevant to the original query. Word embedding models can extract semantic information from large amounts of text, so this thesis combines a word embedding model with query optimization models in order to obtain richer semantic information. The research covers the following three aspects:

(1) A Semantic-Relevance Model based on the word embedding model is proposed. It extracts deeper semantic information between words from corpora that are rich in semantic information. This deep semantic information gives the query optimization system more comprehensive and effective features for analyzing the semantic relationships between words. Local semantic correlation distributions are extracted from semantic resources such as the synonym forest and the sememe annotation information of the "HowNet" language knowledge base, and the deep mining ability of a neural network model is used to fit the local semantic correlation distribution of each word in the corpus space into a global semantic correlation distribution.

(2) A variety of algorithms are used to extract features of the words in a query, such as part of speech, word frequency, and word length, and a fusion algorithm is proposed to combine these features. The extracted word features are combined with the Semantic-Relevance Model to carry out the query expansion task. Parameter-tuning experiments are run for the various parameters of the algorithm, and the model is optimized iteratively. In query experiments on a large number of web documents, the query expansion model based on the Semantic-Relevance Model outperforms the traditional Word2vec model.

(3) A coarse-grained text similarity algorithm and a fine-grained query vector similarity algorithm are proposed, and the Semantic-Relevance Model is combined with both similarity algorithms in query recommendation experiments. The experiments obtain candidate query recommendations from a large number of query logs and then compute the recommendation matching degree with the similarity algorithms. The experimental results show that the query recommendation method based on the Semantic-Relevance Model is more effective, provided that an appropriate threshold is selected.
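
As a rough illustration of the two query optimization tasks named in the abstract, the sketch below expands a query with its nearest neighbors in an embedding space and then keeps only those candidate queries whose averaged vectors are similar enough to the original query. It is not the thesis's Semantic-Relevance Model: the three-dimensional vectors, the function names `expand_query` and `recommend`, the averaging of word vectors into a query vector, and the default threshold of 0.9 are all assumptions made up for this example.

```python
import math

# Toy 3-dimensional embeddings, invented purely for illustration.
EMBEDDINGS = {
    "laptop":   [0.9, 0.1, 0.0],
    "notebook": [0.8, 0.2, 0.1],
    "computer": [0.7, 0.3, 0.0],
    "banana":   [0.0, 0.1, 0.9],
    "fruit":    [0.1, 0.2, 0.8],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(terms, embeddings, top_k=2):
    """Query expansion: for each known term, add its top_k nearest words."""
    expansions = []
    for term in terms:
        if term not in embeddings:
            continue  # out-of-vocabulary terms cannot be expanded
        scored = [
            (cosine(embeddings[term], vec), word)
            for word, vec in embeddings.items()
            if word not in terms
        ]
        scored.sort(reverse=True)
        for _, word in scored[:top_k]:
            if word not in expansions:
                expansions.append(word)
    return expansions


def query_vector(terms, embeddings):
    """Represent a query as the mean of its known word vectors (an assumption)."""
    vecs = [embeddings[t] for t in terms if t in embeddings]
    if not vecs:
        return None
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(len(vecs[0]))]


def recommend(query, candidates, embeddings, threshold=0.9):
    """Query recommendation: keep candidates whose similarity meets the threshold."""
    qv = query_vector(query, embeddings)
    if qv is None:
        return []
    kept = []
    for cand in candidates:
        cv = query_vector(cand, embeddings)
        if cv is not None:
            score = cosine(qv, cv)
            if score >= threshold:
                kept.append((score, cand))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in kept]


if __name__ == "__main__":
    print(expand_query(["laptop"], EMBEDDINGS))
    print(recommend(["laptop"],
                    [["notebook", "computer"], ["banana", "fruit"]],
                    EMBEDDINGS))
```

In this toy setup the threshold plays the role the abstract assigns to it: set too low, unrelated candidates pass the filter; set too high, relevant candidates are dropped.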
Keywords/Search Tags: query optimization, query expansion, query recommendation, word embedding