Research On Advertising CTR Forecast Based On Multi-model Integration

Posted on: 2020-06-07
Degree: Master
Type: Thesis
Country: China
Candidate: C H Wang
Full Text: PDF
GTID: 2428330590996518
Subject: Computer technology
Abstract/Summary:
Advertising is the main source of income for Internet companies. The rapid development of Internet technology and the growth of data volume provide the basis for optimizing advertising. Predicting the click-through rate (CTR) of an ad helps to pinpoint the appropriate set of users and to match the best delivery mix. However, current CTR prediction algorithms have low accuracy, and improving CTR prediction can bring more commercial benefit to Internet companies. Ad click data suffers from class imbalance: typically only a small number of ads receive many clicks, while most receive few or none. This imbalance seriously degrades the predictive performance of the model. Moreover, Internet companies currently rely mostly on single-model CTR prediction algorithms. A single model offers limited improvement in prediction and requires a great deal of manual feature extraction, so the time cost is high. In response to these problems, this thesis studies how to improve prediction accuracy from the following three aspects. First, an analysis of the distribution of the advertising data reveals a long-tail problem in the click data, which is addressed by introducing the LS-PLM algorithm. The idea is to use a piecewise linear model to fit the nonlinear classification surface of a high-dimensional space, dividing the data into different feature spaces for separate training and prediction, so that the model can extract the relationships between features more effectively. Second, the principles and characteristics of traditional single-model shallow learning algorithms and of shallow learning integrated algorithms are studied, and improvements are made on this basis. The improved algorithm combines the advantages of forest models such as XGBOOST with those of the FM model. A TREE subnetwork and an FM subnetwork are constructed by cascading, so that the shallow nonlinear relationships between features are fully explored, which improves CTR prediction. Third, algorithms that integrate shallow learning and deep learning for CTR prediction are studied. WIDE&DEEP is an integrated prediction model, proposed by Google, that combines a linear model with deep learning. Building on the preceding shallow learning research, this thesis improves WIDE&DEEP to obtain the integrated model TDNN. The model uses the shallow learning network TREE-FM to extract the low-order nonlinear relationships between features and a deep learning network to extract the higher-order nonlinear relationships; the shallow learning output and the deep learning output are then concatenated so that the prediction draws on the effective information from both, further improving accuracy. The experimental results show that solving the long-tail problem of the advertisements greatly improves CTR prediction accuracy. Compared with single-model algorithms such as logistic regression and FM, the shallow learning integrated algorithm TREE-FM shows a large improvement in both accuracy and AUC. The improved integrated algorithm TDNN achieves the highest accuracy, nearly 5 percentage points above the traditional models, and raises the AUC value by 1 to 3 percentage points.
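The piecewise linear idea behind LS-PLM can be illustrated with a small sketch. It assumes the commonly published form of the model, in which a softmax gate softly assigns each sample to a region and each region has its own logistic regression; the abstract does not say which variant the thesis uses. The function names and the toy parameters below are hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ls_plm_predict(x, gate_weights, region_weights):
    """Click probability as a gated mixture of per-region logistic models.

    p(click | x) = sum_i softmax(u . x)_i * sigmoid(w_i . x)
    where gate_weights holds the u_i and region_weights holds the w_i.
    """
    gates = softmax([dot(u, x) for u in gate_weights])
    clicks = [sigmoid(dot(w, x)) for w in region_weights]
    return sum(g * c for g, c in zip(gates, clicks))

# Hypothetical toy parameters: 2 regions, 2 features (the second acts as a bias).
x = [1.0, 1.0]
gate_weights = [[2.0, 0.0], [-2.0, 0.0]]
region_weights = [[1.5, -0.5], [-1.0, 0.0]]
p = ls_plm_predict(x, gate_weights, region_weights)
print(p)
```

With a single region the gate is always 1 and the model reduces to ordinary logistic regression, which is one way to see why adding regions lets the model fit a nonlinear decision surface.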
Keywords/Search Tags:advertising click-through rate, shallow learning, long tail problems, deep learning, integration learning