
Research on Microblog Forwarding Prediction Based on Hybrid Feature Learning and Filter-Wrapper Feature Selection

Posted on: 2021-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: L S Zhang
GTID: 2518306197455744
Subject: Computer application technology
Abstract/Summary:
Forwarding is an important way in which information spreads on Weibo. Research on Weibo forwarding is of great significance for hotspot mining, sensitive information control and online public opinion monitoring. Existing research does not analyze the factors that affect Weibo forwarding comprehensively, does not separate feature categories clearly, and overlooks some key features. In addition, most studies feed all extracted features into the model. Irrelevant or weakly correlated features not only increase the feature dimension but also reduce the accuracy of the prediction model. Accurately mining the factors that affect forwarding, extracting strongly correlated features, and improving the accuracy of forwarding prediction have therefore become the focus of research.

Based on an analysis of the factors that influence the forwarding behavior of Weibo users, this thesis extracts features of those factors from a real Sina Weibo dataset and proposes a forwarding prediction method that combines a Filter-Wrapper feature selection algorithm, based on information gain ratio and weight, with the XGBoost algorithm. Comparative experiments verify the effectiveness of this prediction method. The main work of this thesis is as follows:

(1) Acquisition and processing of real Sina Weibo data. Weibo data are collected by combining the API with a Python web crawler, followed by statistical analysis, data cleaning, and sample generation on the dataset.

(2) Mining the important factors that affect users' forwarding behavior and extracting hybrid features. Taking the Weibo receiver as the center, the thesis extracts user features, microblog features, social features and interest features, and discusses the importance of each feature category for forwarding prediction. The experimental results show that interest features have the greatest influence on the prediction model.

(3) A hybrid feature selection algorithm based on information gain ratio and weight. The algorithm follows the Filter-Wrapper mode: it combines the advantages of the Filter mode and the Wrapper mode, maximizing the efficiency of the algorithm while preserving the performance of the selected feature subset. Experimental results show that this method selects a better feature subset and improves the accuracy of the prediction model.

(4) A microblog forwarding prediction model based on the optimal hybrid feature subset and the XGBoost algorithm. The Filter-Wrapper feature selection algorithm is first used to select the optimal feature subset, and the XGBoost algorithm is then used to build the forwarding prediction model. The experimental results show that this prediction method is superior to other forwarding prediction algorithms in accuracy and time consumption, which also demonstrates the importance of feature selection for forwarding prediction.

(5) A prototype microblog forwarding prediction system based on hybrid features and feature selection, designed and implemented with Python and the Django web framework. The system presents three modules, namely the data preprocessing module, the feature selection module and the forwarding prediction module, which together give a complete and intuitive view of the algorithm described in this thesis.
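The abstract does not give the details of the Filter-Wrapper algorithm, so the following is only a minimal sketch of the general idea, not the thesis's implementation. It assumes discrete features, uses the C4.5-style information gain ratio for the filter step, and uses greedy forward selection for the wrapper step. The weight component mentioned in the abstract is not specified there and is omitted. The feature names, the toy data and `toy_score` are hypothetical; in a setup like the one described, the score would more plausibly be the validation accuracy of an XGBoost model trained on the candidate subset.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain ratio of one discrete feature with respect to the labels."""
    total = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:  # a constant feature carries no information
        return 0.0
    return (entropy(labels) - conditional) / split_info


def filter_stage(columns, labels, keep):
    """Filter step: rank features by gain ratio and keep the top `keep` names."""
    ranked = sorted(columns, key=lambda name: gain_ratio(columns[name], labels), reverse=True)
    return ranked[:keep]


def wrapper_stage(candidates, score):
    """Wrapper step: greedy forward selection, adding a feature only if `score` improves."""
    selected, best = [], float("-inf")
    for name in candidates:
        trial = score(selected + [name])
        if trial > best:
            selected, best = selected + [name], trial
    return selected


# Hypothetical toy data: 1 = the receiver forwarded the microblog, 0 = did not.
labels = [1, 1, 0, 0, 1, 0]
columns = {
    "interest_match": [1, 1, 0, 0, 1, 0],
    "is_verified": [1, 0, 1, 0, 1, 0],
    "constant": [0, 0, 0, 0, 0, 0],
}


def toy_score(names):
    """Stand-in for model accuracy: training accuracy of a lookup-table
    classifier on the chosen features, minus a small penalty per feature."""
    keys = list(zip(*(columns[n] for n in names)))
    groups = {}
    for key, y in zip(keys, labels):
        groups.setdefault(key, []).append(y)
    correct = sum(max(Counter(g).values()) for g in groups.values())
    return correct / len(labels) - 0.01 * len(names)


shortlist = filter_stage(columns, labels, keep=2)
subset = wrapper_stage(shortlist, toy_score)
print(shortlist, subset)
```

The filter step is cheap because it scores each feature independently, while the wrapper step is more expensive because every candidate subset requires a model evaluation. Running the filter first keeps the number of wrapper evaluations small, which is the efficiency argument made in item (3).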
Keywords/Search Tags:Weibo, Forward prediction, Hybrid features, Feature selection