
Research On Feature Interaction And Knowledge Distillation Based CTR Prediction Models

Posted on: 2023-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Xie
GTID: 2558307058999519
Subject: Computer technology
Abstract/Summary:
Click-through rate (CTR) prediction is widely used in recommender systems, search engines, and computational advertising. It estimates the probability that a user will interact with a promoted item. CTR prediction is critical because even a small improvement in CTR can bring substantial benefits to applications with a large user base, yet it remains challenging.

On the one hand, most CTR prediction models simply concatenate the input feature embeddings and feed them into a DNN. Although this approach exploits the strong fitting power of DNNs, it uses the input features coarsely and loses their structural information, which limits prediction accuracy. To address this, this thesis applies a GNN to enhance the learning of feature interactions, which preserves the structure of feature nodes during modelling. A DNN that shares the same embedding layer with the GNN is also used to model high-order feature interactions. Adaptive fusion dynamically combines the DNN and the GNN to produce the final CTR prediction.

On the other hand, advanced CTR prediction models focus on designing sophisticated feature interaction layers. Although they achieve better performance, they are hard to apply in large-scale industrial applications because they often require a large amount of computation and run inefficiently. This thesis therefore proposes a multi-level knowledge distillation (KD) training method that transfers the ability of heavy teacher models with complicated feature interaction layers to light student models. The resulting light model, both accurate and efficient, can then be deployed in industrial applications. The multi-level KD training method consists of three parts: soft label, hint regression, and knowledge regression. Together they help the student model learn thoroughly from the teacher model at the output layer, the hidden layer, and the embedding layer.

Detailed experiments are performed on the Avazu and Criteo datasets, and the hyperparameters and key components are studied. The experimental results indicate the effectiveness of the two proposed models: the graph feature interaction enhanced neural network and the multi-level knowledge distillation-based model. Finally, this thesis designs and implements a prototype of a deep learning-based personalized movie recommender system built on the MovieLens dataset and the proposed model. The system modules and recommendation results are presented, indicating the effectiveness of the system and the application value of this thesis.
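
The three distillation levels named in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual formulation, which the abstract does not give. It assumes that the soft-label term is a temperature-scaled binary cross-entropy against the teacher's output, that hint regression and knowledge regression are mean squared errors on hidden and embedding vectors respectively, and that student and teacher vectors already have matching dimensions (in practice a projection layer would likely be needed). The function names, the weights `alpha`, `beta`, `gamma`, and the `temperature` value are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a logit to a click probability."""
    return 1.0 / (1.0 + math.exp(-z))


def bce(target, pred, eps=1e-7):
    """Binary cross-entropy of `pred` against `target` (target may be soft)."""
    pred = min(max(pred, eps), 1.0 - eps)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_level_kd_loss(label, student_logit, teacher_logit,
                        student_hidden, teacher_hidden,
                        student_emb, teacher_emb,
                        temperature=2.0, alpha=0.5, beta=0.1, gamma=0.1):
    """Hypothetical combined loss for one training example.

    All weights and the temperature are illustrative assumptions,
    not values taken from the thesis.
    """
    # Ordinary supervised loss on the ground-truth click label.
    hard = bce(label, sigmoid(student_logit))
    # Output layer: match the teacher's softened prediction (soft label).
    soft = bce(sigmoid(teacher_logit / temperature),
               sigmoid(student_logit / temperature))
    # Hidden layer: regress student activations onto the teacher's (hint).
    hint = mse(student_hidden, teacher_hidden)
    # Embedding layer: regress student embeddings onto the teacher's.
    knowledge = mse(student_emb, teacher_emb)
    return hard + alpha * soft + beta * hint + gamma * knowledge


# Usage: one example with a positive click label and toy vectors.
loss = multi_level_kd_loss(1.0, 0.3, 0.8,
                           [0.1, 0.2], [0.1, 0.2],
                           [0.5], [0.5])
```

Under these assumptions, the hidden-layer and embedding-layer terms vanish when the student reproduces the teacher's representations exactly, so the loss then reduces to the hard-label term plus the weighted soft-label term.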
Keywords/Search Tags: deep learning, CTR prediction, recommender system, graph neural network, knowledge distillation