Research On Cold Start Problem In Click-through Rate Prediction For Search Advertising

Posted on: 2017-03-10  Degree: Master  Type: Thesis
Country: China  Candidate: L F Deng  Full Text: PDF
GTID: 2308330509457107  Subject: Computer technology
Abstract/Summary:
As a new form of advertising, online advertising has become one of the most important revenue models in network marketing. Search advertising is the largest and fastest-growing form of online advertising and accounts for more than half of the online advertising market. Predicting the click-through rate (CTR) is the most critical technology in search advertising. For ads with a large click history, CTR can be predicted from statistical data. New and rare ads, however, lack historical data and therefore suffer from a serious cold start problem, which is a major obstacle in CTR prediction. This paper focuses on predicting the CTR of new/rare ads. Because such ads lack statistical data, we extract new features that let them inherit statistical information from other, frequent ads. The work consists of three parts.

First, the processing of large-scale data. Advertising log data is large and complicated. We analyze the complex fields of the advertising logs and pre-process the dataset in order to capture the overall distribution of the data correctly. We then introduce the evaluation metrics for the cold start problem in CTR prediction.

Second, the extraction of high-value features. Because of their sparse click history, the characteristics of new/rare ads are unstable. Previous approaches consider only shallow basic features such as ad text and do not mine latent features. In this paper we present an approach that extracts two high-value features. One uses Gradient Boosted Decision Trees (GBDT) to produce new features from basic features. The other lets new/rare ads inherit statistical information from frequent ads; it is derived from token-based query-ad click-through graphs.

Finally, the research on an online updating algorithm. Because of time and hardware limitations, we use Follow-the-Regularized-Leader (FTRL), an online learning algorithm, as the classification model, and then use AdaBoost for model integration. The model can effectively handle large-scale sparse data and greatly shortens running time.
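
The abstract names FTRL as the online classifier but gives no implementation details. As a rough illustration only, the sketch below shows the commonly used per-coordinate FTRL-Proximal update for logistic regression over sparse binary features. The class name, hyperparameter values, and feature strings are assumptions made for this example and are not taken from the thesis; the AdaBoost integration step is not shown.

```python
import math


class FTRLProximal:
    """Illustrative FTRL-Proximal logistic regression for sparse binary features.

    Hypothetical sketch: names and default hyperparameters are assumptions.
    """

    def __init__(self, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
        self.alpha = alpha  # learning-rate scale
        self.beta = beta    # learning-rate smoothing term
        self.l1 = l1        # L1 strength (drives weights to exactly zero)
        self.l2 = l2        # L2 strength
        self.z = {}         # per-feature adjusted gradient sums
        self.n = {}         # per-feature sums of squared gradients

    def _weight(self, feature):
        """Closed-form weight for one feature, computed lazily from z and n."""
        z = self.z.get(feature, 0.0)
        if abs(z) <= self.l1:
            return 0.0  # L1 keeps rarely seen or weak features at zero
        n = self.n.get(feature, 0.0)
        sign = 1.0 if z > 0 else -1.0
        return -(z - sign * self.l1) / ((self.beta + math.sqrt(n)) / self.alpha + self.l2)

    def predict(self, features):
        """Predicted click probability for a set of active (value 1) features."""
        score = sum(self._weight(f) for f in set(features))
        score = max(min(score, 35.0), -35.0)  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, label):
        """One online step on a single impression; label is 1 (click) or 0."""
        active = set(features)
        weights = {f: self._weight(f) for f in active}
        p = self.predict(active)
        g = p - label  # log-loss gradient for a feature with value 1
        for f in active:
            n_old = self.n.get(f, 0.0)
            sigma = (math.sqrt(n_old + g * g) - math.sqrt(n_old)) / self.alpha
            self.z[f] = self.z.get(f, 0.0) + g - sigma * weights[f]
            self.n[f] = n_old + g * g
        return p


# Usage: stream impressions one at a time (feature strings are made up).
model = FTRLProximal(alpha=0.5, l1=0.1, l2=0.1)
model.update(["query_token:shoes", "ad_token:sneakers"], 1)
model.update(["query_token:shoes", "ad_token:insurance"], 0)
```

Weights are never stored directly; they are recomputed from the two per-feature accumulators, which is what lets this kind of model keep memory low on large, sparse advertising data.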
Keywords/Search Tags: click-through rate, search advertising, cold start, click-through graph, online learning