As the core functional module of various personalized online services, including online advertising, recommendation systems, and web search, click-through rate prediction analyzes information in advertising data such as the advertisement, user, context, and search category. Accurate prediction results are crucial for improving user experience, reducing advertising delivery costs, and increasing media platforms’ revenue. Therefore, designing an effective and accurate prediction model is of great theoretical and practical significance. Since there are few effective features in the click data, more useful information must be obtained through feature interaction.

In this thesis, the key issues of feature interaction in click-through rate prediction algorithms are studied in connection with actual application scenarios and current research frontiers. From the perspective of feature interaction modeling, efficient feature interaction methods are explored to mine the implicit correlation information between features, and both low-order and high-order feature interaction modeling are studied. From the perspective of user behavioral feature modeling, user behavior sequences and user interest modeling are studied to mine users’ interest preferences. To enhance the robustness of the prediction model, information entropy is used to quantify uncertainty while improving the performance of click-through rate prediction. The main innovations and specific research contents are:

1. To address the problems that redundant features introduce noise into the prediction results and that existing models rely on a single feature interaction method, a click-through rate prediction model based on feature weighting and feature interaction is proposed. Firstly, the importance of features and feature interactions is measured using mutual information theory and feature domain weighting modules. Secondly, a bilinear function is proposed to learn second-order feature interactions in a fine-grained manner. To further improve performance, a classical deep neural network is combined with a shallow model to model higher-order and non-linear feature interactions. Experimental results on several datasets show that the proposed model achieves high prediction accuracy.

2. To address the problem of inadequate explicit feature interactions in prediction models, this thesis proposes a click-through rate prediction model based on a multi-head self-attention network. Firstly, a multi-head self-attention network is proposed to model explicit higher-order feature interactions. Secondly, a combination of bilinear functions and DNNs is proposed to model implicit higher-order feature interactions. At the same time, regulation and bridge modules are proposed to address the problems of over-sharing in the input and under-sharing in the network. The validity of the proposed model is verified, and an interpretability analysis is carried out, through experiments on four public datasets.

3. To address the problems of imperfect sequential feature modeling and insufficient user interest mining in prediction models, a click-through rate prediction model based on the interaction of user behavioral features is proposed. The model combines users’ short-term and long-term behaviors to capture their preferences and uses an attention capsule network to capture multiple interests from their behavioral history. The performance of the proposed model is validated on public datasets, and the results show that the prediction accuracy of the approach combining behavioral sequences and multi-interest modeling is greatly improved.

4. To address the problems of poor feature representation and the lack of uncertainty quantification in prediction models, a click-through rate prediction model based on feature interaction and uncertainty quantification is proposed. A prediction framework combining feature selection and feature interaction is proposed, on the basis of which Bayesian deep learning is used to quantify the uncertainty of the prediction model. The effectiveness of the proposed method, together with its ability to quantify uncertainty, is validated on multiple datasets.
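
As a rough illustration of how information entropy can express the uncertainty of a click prediction, the following minimal Python sketch computes the Shannon entropy of a predicted click probability. It is not the thesis's model: the function names and the sampled probabilities are hypothetical, and it assumes that a Bayesian model yields several sampled click probabilities per impression (for example, from repeated stochastic forward passes), whose mean is then scored.

```python
import math
from typing import Sequence


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a Bernoulli click probability p."""
    # By convention 0 * log(0) = 0, so a certain outcome has zero entropy.
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def predictive_entropy(sampled_probs: Sequence[float]) -> float:
    """Entropy of the mean of several sampled click probabilities."""
    mean_p = sum(sampled_probs) / len(sampled_probs)
    return binary_entropy(mean_p)


# Hypothetical click probabilities sampled for two impressions.
confident = [0.02, 0.03, 0.02, 0.04]
uncertain = [0.35, 0.60, 0.48, 0.55]

print(round(predictive_entropy(confident), 3))
print(round(predictive_entropy(uncertain), 3))
```

Under these assumptions, an impression whose mean predicted probability lies near 0.5 receives an entropy close to one bit, while a prediction near 0 or 1 receives an entropy close to zero, so the score can be used to flag predictions that deserve less trust.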