Research on User Portrait Algorithm Based on User Network Behavior

Posted on: 2020-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Mei
GTID: 2428330575494686
Subject: System theory
Abstract/Summary:
With the widespread adoption of the Internet, and of the mobile Internet in particular, the volume of user-generated data has exploded. People leave large amounts of behavior data on the network every day, such as query words and web page access records. This network behavior data is varied and time-sensitive, and it provides ample material for analyzing user preferences and personal attributes and for building user portrait models. As a foundation of enterprise big data, user portraits draw on recorded user behavior to describe a user's attributes as completely as possible. Constructing them efficiently helps enterprises achieve precise marketing and personalized service.

Because manually labeling user portraits is inefficient, predicting labels with algorithmic models has become an active research direction. However, current mainstream machine learning algorithms do not capture the complex relationships between features, and when features are high-dimensional and sparse their prediction performance remains unsatisfactory, leaving considerable room for improvement. Hybrid algorithms can often combine the strengths of individual algorithms, offsetting their weaknesses to some extent and improving prediction accuracy. This thesis therefore studies user portrait construction from users' query records, with the goal of predicting users' multi-dimensional demographic attribute labels. The research work is summarized as follows:

1) A two-layer ensemble learning framework based on the random forest algorithm is proposed. In the first layer, six traditional machine learning algorithms serve as feature extractors for the user's query words, and their outputs are combined with the user's numerical features to form the input of the second layer. In the second layer, the random forest algorithm is used as a classifier to fit different strategies, in order to mine hidden information between the user portrait and the user's demographic attribute labels. Experiments show that the two-layer ensemble learning algorithm based on random forest has better generalization and predictive ability than a single base model.

2) A two-layer ensemble learning framework based on the XGBoost algorithm is proposed. Based on the characteristics of the user query corpus, Doc2Vec is improved after analyzing and comparing the advantages and disadvantages of three commonly used document vector representation methods. Combined with the BP neural network algorithm, this yields the BPDM (BPNN-based Doc2Vec Model) algorithm, which extracts complex features between word meanings. In the first layer of the framework, the BPDM algorithm and the random-forest-based two-layer ensemble learning algorithm serve as extractors of user query word features. In the second layer, the XGBoost algorithm is used as the classifier to construct the user portrait model. The experimental results show that the proposed method effectively reduces the feature dimension, enriches the diversity of the base classifiers, and predicts the multi-dimensional demographic attributes of the user portrait model efficiently and accurately.
Keywords/Search Tags: User portrait, Network behavior, Ensemble learning, Tag prediction, Machine learning, XGBoost
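The two-layer framework in item 1) can be illustrated with a minimal scikit-learn sketch. This is not the thesis's implementation. The abstract does not name the six first-layer algorithms, so two common text classifiers stand in for them. The toy queries, the two numeric features, the binary label, and the use of out-of-fold probabilities as the extracted features are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: one string of query words per user, two numeric
# features per user, and a binary demographic label (all invented here).
queries = [
    "football scores transfer news", "car engine review",
    "nba playoffs schedule", "gaming laptop benchmark",
    "skincare routine tips", "dress sale online",
    "lipstick shades review", "yoga class near me",
] * 3
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1] * 3)
rng = np.random.default_rng(0)
numeric = rng.random((len(queries), 2))  # placeholder numeric user features

# Layer 1: text models used as query-word feature extractors.
# Two stand-ins replace the six (unnamed) algorithms of the abstract.
base_models = [
    make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000)),
    make_pipeline(TfidfVectorizer(), MultinomialNB()),
]

# Out-of-fold class probabilities: each row is predicted by a model that
# did not see that row during training (an assumed, common stacking choice).
layer1 = [
    cross_val_predict(model, queries, labels, cv=3, method="predict_proba")
    for model in base_models
]

# Layer 2 input: layer-1 probabilities concatenated with numeric features.
stacked = np.hstack(layer1 + [numeric])

# Layer 2: random forest classifier trained on the stacked features.
meta = RandomForestClassifier(n_estimators=50, random_state=0)
meta.fit(stacked, labels)
predictions = meta.predict(stacked)
```

To score new users under this sketch, each base model would be refit on the full training set, its `predict_proba` output concatenated with the new users' numeric features, and the result passed to `meta.predict`.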