
Research On The Advertisement Click-Through-Rate Prediction Algorithm Based On Distributed Representation

Posted on: 2020-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: T Wang
GTID: 2428330590450625
Subject: Software engineering
Abstract/Summary:
Online advertising is an important part of the advertising market and one of the main sources of profit for major Internet companies. Ad exchange platforms bill advertisers based on users' clicks. Under a real-time bidding strategy, an ad exchange platform shows users the advertisement with the highest expected revenue, and the estimate of expected revenue depends on the estimated click-through rate. Accurate click-through rate estimation is therefore in strong demand. In this problem, the data are typically high-dimensional with sparse features, and the head and the tail of the data differ markedly. How high-dimensional sparse features are represented therefore plays a crucial role in the performance of the whole model. Existing models, however, often either cannot use high-dimensional sparse features at all or represent them inadequately.

To address this problem, this paper proposes a new algorithm based on distributed representation, which yields low-dimensional representations of high-dimensional features. In natural language processing, the distributional characteristic means that the similarity of words is positively correlated with the similarity of their distributed representations. Based on this characteristic, the semantics of words are often learned from large-scale corpora using unsupervised learning methods. An analysis of a typical dataset suggests that, in specific scenarios, the data used in click-through rate estimation exhibit the same distributional characteristic and that contexts also exist for them. A low-dimensional distributed representation of high-dimensional sparse features, that is, a hidden vector for each feature, can therefore be obtained in a similar way. This paper also proposes an applicability criterion for the distributional characteristic, which measures the diversity between groups and the coverage of the group context. Using this criterion, a suitable combination of features can be found, and a series of meaningful contexts can be generated from that combination.

Given these contexts, this paper uses the singular value decomposition method and the Skip-gram method, both modified for the current task, to learn the distributed representation and obtain low-dimensional hidden vectors for the high-dimensional features. Three representative models are selected. They are first trained without distributed representations and then retrained with the low-dimensional hidden vectors combined into their inputs. Experimental results on the AVAZU ad click-through rate prediction dataset from the Kaggle platform indicate that this method can effectively learn high-dimensional sparse features and improve the performance of ad click-through rate prediction.
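
The step of turning feature contexts into low-dimensional hidden vectors can be illustrated with a small sketch. It is not the thesis's method: the abstract does not describe how the singular value decomposition is modified, so the code below assumes plain truncated SVD applied to a positive pointwise mutual information (PPMI) matrix, a common choice for word vectors in natural language processing. The toy log, the feature values (`dev_a`, `site_news`, and so on), and the function name `learn_hidden_vectors` are all hypothetical.

```python
import numpy as np

# Hypothetical toy click log: (target feature value, context feature value).
# dev_a and dev_b appear in exactly the same contexts; dev_c does not.
pairs = (
    [("dev_a", "site_news")] * 3 + [("dev_a", "site_sport")]
    + [("dev_b", "site_news")] * 3 + [("dev_b", "site_sport")]
    + [("dev_c", "site_game")] * 3 + [("dev_c", "site_sport")]
)


def learn_hidden_vectors(pairs, dim=2):
    """Return (targets, vectors): one dim-sized hidden vector per target value.

    Assumption: plain truncated SVD on a PPMI matrix, not the modified
    SVD mentioned in the abstract.
    """
    targets = sorted({t for t, _ in pairs})
    contexts = sorted({c for _, c in pairs})
    t_idx = {t: i for i, t in enumerate(targets)}
    c_idx = {c: j for j, c in enumerate(contexts)}

    # Co-occurrence counts between target values and context values.
    counts = np.zeros((len(targets), len(contexts)))
    for t, c in pairs:
        counts[t_idx[t], c_idx[c]] += 1.0

    # Positive PMI: log(p(t, c) / (p(t) p(c))), clipped at zero.
    p_tc = counts / counts.sum()
    p_t = p_tc.sum(axis=1, keepdims=True)
    p_c = p_tc.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore"):
        pmi = np.log(p_tc / (p_t * p_c))  # unseen pairs give -inf
    ppmi = np.maximum(pmi, 0.0)           # -inf and negatives become 0

    # Truncated SVD: keep the top-k left singular vectors, scaled by sqrt(s).
    u, s, _ = np.linalg.svd(ppmi, full_matrices=False)
    k = min(dim, len(s))
    vectors = u[:, :k] * np.sqrt(s[:k])
    return targets, vectors


targets, vectors = learn_hidden_vectors(pairs, dim=2)
for name, vec in zip(targets, vectors):
    print(name, np.round(vec, 3))
```

In this toy log, `dev_a` and `dev_b` share identical contexts and so receive the same hidden vector, while `dev_c` lands in a different direction. In a click-through rate model, such vectors would be concatenated with, or substituted for, the one-hot encoding of the corresponding sparse feature before retraining.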
Keywords/Search Tags: Online Advertisement, Click-through Rate Prediction, Distributional Representation, Dimension Reduction