Research on an Online Expansion Method of Long Tail Queries for Search Advertisement

Posted on: 2018-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Li
Full Text: PDF
GTID: 2348330542490932
Subject: Computer Science and Technology
Abstract/Summary:
User queries generally follow a power-law distribution: the head and trunk of the distribution consist of common (high-frequency) queries, while the tail consists of long tail (low-frequency) queries. According to statistics, long tail queries account for about 60% of all distinct queries and cover the majority of users. Because a large share of long tail queries fail to recall any advertisements, the advertisement recall rate is low. The most common solution in industry and academia is to expand the long tail query. Since user behavior data for long tail queries are relatively sparse, expansion resources are generally obtained from external sources. A major drawback is that existing methods do not run in real time. In addition, long tail queries entered by users tend toward natural language and often use rare or inappropriate expressions, which makes it difficult for the search engine to understand their semantics and match advertisements. This thesis studies the effectiveness of expanding long tail queries with external resources and the mining of their semantic information, and proposes the Online Expansion Algorithm of Long Tail Query (OEALTQ). The main research contents are as follows.
Firstly, because existing expansion methods have poor real-time performance, a combined online and offline method is adopted to reduce the latency of online expansion and to realize online expansion of long tail queries. Offline, the method first expands the common queries and builds an index over the expanded common queries; the long tail query is then mapped to relevant common queries, which completes its online expansion.
Secondly, because long tail queries are difficult to understand, the intent and semantic features of the query are mined through different methods. For uncommon queries and low-frequency words, expansion uses their related words and similar words. Because of the online requirement, these expansion words must be extracted efficiently, so an existing knowledge base is used. For long tail queries that tend toward natural language, semantic features are extracted with a trained word vector model in order to better understand their intent. On this basis, the Query Intention Word Extension Algorithm (QIWEA) is proposed.
Thirdly, after a long tail query is expanded with common queries, new advertisements still cannot be triggered, so a set of bid word clusters is introduced to extend the bid words triggered by the relevant common queries and thereby trigger new advertisements. The Bid Word Clustering Algorithm (BWCA) is proposed for this purpose.
Finally, the OEALTQ method is analyzed on a real dataset, and the effectiveness and usability of the QIWEA and BWCA algorithms are verified.
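The offline and online split described in the abstract can be illustrated with a small sketch. This is not the thesis's OEALTQ implementation: the abstract does not say how a long tail query is mapped to common queries, so plain token-overlap (Jaccard) matching is assumed here as a stand-in. The names `build_index`, `expand_online`, and `COMMON_EXPANSIONS`, as well as the sample data, are hypothetical.

```python
from collections import defaultdict

# Assumed offline result: each common query with its precomputed expansions.
COMMON_EXPANSIONS = {
    "running shoes": ["sneakers", "jogging shoes"],
    "cheap flights": ["discount airfare", "low cost airline tickets"],
    "trail shoes": ["hiking footwear"],
}


def build_index(common_expansions):
    """Offline step: inverted index from each token to the common queries containing it."""
    index = defaultdict(set)
    for query in common_expansions:
        for token in query.lower().split():
            index[token].add(query)
    return index


def expand_online(long_tail_query, index, common_expansions, top_k=2):
    """Online step: map a long tail query to its closest common queries
    and reuse the expansions that were computed for them offline."""
    tokens = set(long_tail_query.lower().split())

    # Candidate common queries share at least one token with the input.
    candidates = set()
    for token in tokens:
        candidates |= index.get(token, set())

    def jaccard(common_query):
        common_tokens = set(common_query.lower().split())
        return len(tokens & common_tokens) / len(tokens | common_tokens)

    # Highest overlap first; ties are broken alphabetically for determinism.
    matched = sorted(candidates, key=lambda q: (-jaccard(q), q))[:top_k]

    # Merge the expansions of the matched queries, keeping order, dropping duplicates.
    expansions = []
    for query in matched:
        for expansion in common_expansions[query]:
            if expansion not in expansions:
                expansions.append(expansion)
    return matched, expansions


INDEX = build_index(COMMON_EXPANSIONS)
matched, expansions = expand_online(
    "lightweight running shoes for flat feet", INDEX, COMMON_EXPANSIONS
)
```

Only the index lookup and a small ranking step run at query time, which is the reason the expensive expansion of common queries is moved offline. A real system would presumably replace the Jaccard score with a semantic similarity, such as one based on the word vectors mentioned in the abstract.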
Keywords/Search Tags:Search advertisement, Query extension, Long tail query, Bid word clustering, Word embedding