
Research On Social Network Spammer Detection Based On Deep Learning

Posted on: 2022-10-28
Degree: Master
Type: Thesis
Country: China
Candidate: L F Li
Full Text: PDF
GTID: 2518306575463544
Subject: Software engineering
Abstract/Summary:
Weibo has become the most popular social media platform for online creation and content sharing in China. As of September 2020, Weibo had 511 million monthly active users and an average of 224 million daily active users. Weibo helps users collect and disseminate information on the Internet. However, the proliferation of spam users on social networks has seriously degraded the experience of normal users and hindered the dissemination of useful information. Based on preliminary research, the spam users studied in this thesis mainly comprise bot accounts with abnormal account characteristics and marketing accounts that send large amounts of spam. This thesis takes Sina Weibo spammers as its research object, analyzes the account characteristics, behavior characteristics, and Weibo text characteristics of spam accounts, and applies supervised deep learning to detect spam users effectively. The work completed in this thesis falls into the following four parts.

1. Efficiently crawl and store user data from Sina Weibo. The data set obtained in this thesis covers 2274 users with 90 days of continuous data. Using different crawling channels, the data set was labeled as 446 spam users and 1828 normal users.
2. To account for the class imbalance found in the real microblog environment, a sampling ensemble algorithm is designed to reconstruct the acquired microblog data into multiple new class-balanced data sets.
3. By analyzing and summarizing the user characteristics in the Weibo data set, 20 features are selected, such as the number of reposts, the number of comments, the number of likes, and the similarity of Weibo content, to construct a new comprehensive feature vector. A GRU neural network model is then trained on the class-balanced data sets to detect spam accounts.
4. To account for the characteristics of Chinese Weibo text, this thesis proposes a hybrid multi-channel convolutional Bi-LSTM model for Weibo spam text classification. The model uses a multi-channel CNN as the word-sense extraction layer and adds a Bi-LSTM to extract contextual features, addressing the classification accuracy bottleneck of relying on a single word-sense feature extraction model.

Comparative experiments show that the Multi-GRU model proposed in this thesis for spammer identification based on user characteristics reaches a recall of 80.4% and an F1 value of 81.6%, higher than the other ensemble learning models and neural network models compared. The text-content-based Weibo classification model MCBi-LSTM proposed in this thesis achieves an accuracy of 92.07% on the test set. Its recall and F1 value both exceed 90%, indicating that it classifies Chinese text effectively.
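The abstract does not specify how the sampling ensemble algorithm builds its class-balanced data sets. As a minimal sketch of one common approach (repeated random undersampling of the majority class, which is an assumption here and not necessarily the thesis's method), the following Python pairs all spam users with equally sized random draws of normal users. The function name `balanced_subsets`, the choice of four subsets, and the integer IDs standing in for feature vectors are all hypothetical.

```python
import random


def balanced_subsets(majority, minority, n_subsets, seed=0):
    """Build n_subsets class-balanced data sets by undersampling the majority class.

    Assumed scheme (not confirmed by the abstract): every subset keeps all
    minority samples and adds an equally sized random draw of majority samples.
    """
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_subsets):
        # Draw as many majority samples as there are minority samples,
        # without replacement within a single subset.
        sampled = rng.sample(majority, len(minority))
        # Label 0 = normal user, 1 = spam user.
        data = [(x, 0) for x in sampled] + [(x, 1) for x in minority]
        rng.shuffle(data)
        subsets.append(data)
    return subsets


# Class sizes taken from the abstract; integer IDs are placeholders
# for the real per-user feature vectors.
normal_users = list(range(1828))
spam_users = list(range(1828, 2274))

subsets = balanced_subsets(normal_users, spam_users, n_subsets=4)
```

Each resulting subset could then train one base model, with the ensemble combining their predictions. Four subsets is an illustrative choice only, roughly matching the 1828:446 class ratio.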
Keywords/Search Tags: social network, spam users, recurrent neural network, convolutional neural network