| As growth in the Internet industry slows, advertising revenue, one of the main income sources for major Internet platforms, has declined significantly. Finding a lever to raise revenue through refined operations has therefore become a pressing problem. The mainstream advertising billing methods are closely tied to ad exposure, so ad exposure estimation can effectively improve the efficiency of ad placement. In practical business scenarios, however, ad exposure prediction remains challenging: targeting conditions lack an appropriate feature representation, new ads have no historical data, and a single model is poorly suited to complex scenarios. This thesis estimates the daily exposure of advertisements from two angles: feature engineering and algorithmic modeling. In terms of feature engineering, the thesis first extracts valid data from a large dataset to build a training set; then, guided by the business goal of optimizing advertisements more efficiently, it constructs descriptive features for advertisements, advertisers, products, and other related objects. Finally, it proposes feature-engineering solutions to two sources of inaccurate prediction: the lack of historical information for new advertisements in the test set, and the discontinuity between the test and training sets. In terms of algorithmic modeling, to address the feature-extraction shortcomings of the current extreme deep factorization machine (xDeepFM) model, the thesis proposes three advertising exposure prediction schemes built on an extreme deep factorization machine with enhanced feature extraction. First, to address the limited information-extraction ability of the embedding layer in the extreme deep factorization machine model, the thesis proposes an advertising exposure prediction algorithm for an extreme deep factorization machine 
based on graph embedding. The thesis introduces mask-based graph embedding into the embedding layer of the extreme deep factorization machine, which better describes the relationships among users, advertisements, and commodities. For the model, graph embedding strengthens the feature-extraction ability of the embedding layer, and the mask operation prevents leakage of new-advertisement data. Second, redundant features that arise while constructing higher-order crossover features in the extreme deep factorization machine degrade prediction; to address this, the thesis proposes an attention-based extreme deep factorization machine algorithm for advertising exposure prediction. An additive attention mechanism is added to the output layer of the compressed crossover network in the model's feature crossover module. On the one hand, the additive attention mechanism can better learn the weights of crossover features of different orders, improving the model's accuracy; on the other hand, its time complexity is linear, so it does not increase the model's overall time complexity. Finally, to address the model's insufficient extraction of classification features, the thesis proposes an extreme deep factorization machine advertising exposure prediction algorithm based on model fusion. Fusing a decision tree model with the two improved extreme deep factorization machine models above strengthens the feature-extraction capability and thus improves prediction accuracy. The three proposed algorithms are evaluated in extensive comparison experiments on a test set. The experimental results show that the proposed advertising exposure estimation algorithms have better performance 
compared to other models. |
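
The additive attention step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `additive_attention_pool`, the parameters `W`, `b`, and `v`, and the assumption that each crossover order is summarised by one fixed-length vector are all hypothetical choices made for illustration. Each order receives a score `v · tanh(W h + b)`, the scores are normalised with a softmax, and the per-order vectors are combined by a weighted sum, so the cost grows linearly with the number of orders.

```python
import math


def additive_attention_pool(order_feats, W, b, v):
    """Pool K per-order feature vectors with additive attention.

    order_feats: K vectors of length d, one per crossover order (assumed layout).
    W: d x a projection matrix, b: length-a bias, v: length-a scoring vector.
    Returns (pooled vector of length d, list of K attention weights).
    """
    scores = []
    for h in order_feats:
        # hidden_j = tanh(sum_i h_i * W[i][j] + b_j)
        hidden = [
            math.tanh(sum(h[i] * W[i][j] for i in range(len(h))) + b[j])
            for j in range(len(b))
        ]
        # One scalar score per crossover order.
        scores.append(sum(vj * hj for vj, hj in zip(v, hidden)))

    # Softmax over the K scores (shifted by the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the original per-order vectors.
    d = len(order_feats[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, order_feats)) for i in range(d)]
    return pooled, weights


# Toy inputs (hypothetical): 3 crossover orders, each a 2-dimensional vector.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.5, -0.5], [0.25, 0.75]]
b = [0.0, 0.0]
v = [1.0, 1.0]
pooled, weights = additive_attention_pool(feats, W, b, v)
```

In a trained model `W`, `b`, and `v` would be learned jointly with the rest of the network; here they are fixed toy values so that the weighting is easy to inspect.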