Research on Identifying Users Who Post Harmful Speech on Social Networks

Posted on: 2020-12-17
Degree: Master
Type: Thesis
Country: China
Candidate: L J Zhong
GTID: 2428330590954697
Subject: Computer application technology
Abstract/Summary:
Weibo, a social network platform that has developed rapidly in recent years, is popular because it is simple to use, spreads information quickly, and is highly flexible. Users can freely express their opinions and vent their emotions. At the same time, however, many users violate the platform's management rules and wantonly publish statements that harm national interests, party and government construction, national unity, and social stability, as well as vulgar, pornographic, obscene, and other harmful content. Such statements pollute the social network environment, disrupt the normal marketing and promotion of social platforms, and have adverse effects on society. Identifying the users who publish this content therefore plays an important role in cleaning up the network environment, maintaining order online, improving the user experience, and promoting the harmonious development of society.

The key to identifying harmful users is to detect accurately whether the text a user publishes contains harmful information, so this thesis focuses on harmful speech detection. Existing detection methods are generally applied to long texts, to short texts, or to a mixture of both, without taking into account that texts of different lengths have different characteristics. Starting from this observation, this thesis does the following work:

1. Classification of abnormal users. Based on the different characteristics of abnormal users in social networks, abnormal users are divided into seven categories, and each category is summarized.

2. Text preprocessing. Traditional text preprocessing cannot eliminate the interference of noisy data with harmful speech detection. Targeting the characteristics of harmful speech, this thesis preprocesses text through denoising, word segmentation, and stop-word filtering, and proposes a harmful-word expansion method based on word-vector similarity.

3. Multi-feature extraction and fusion. Most of the harmful texts collected are short, and long texts are relatively few. This thesis therefore analyzes harmful texts from the perspective of text length and finds that short texts contain few words, are heavily colloquial, and show spelling variations, whereas long texts contain many words and complex sentences. To account for these differences, three kinds of features are extracted (Bi-gram features, emotional features, and deep text features) and fused to obtain a vector representation of the text.

4. Experiments. The extracted features are evaluated on the collected data sets for harmful speech detection, and the model is then applied to the recognition of users who publish harmful speech. The experiments show that the proposed multi-feature fusion model significantly improves accuracy and F-measure in both harmful text detection and user recognition.
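The preprocessing and feature steps described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the similarity measure, the threshold, the definition of the emotional feature, or the fusion operator. The sketch assumes cosine similarity with a hypothetical threshold of 0.8, uses a lexicon hit ratio as a stand-in for the emotional feature, fuses by plain concatenation, and omits the deep text features, which would come from a neural encoder. All function names and the toy word vectors are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_lexicon(seeds, vectors, threshold=0.8):
    """Add every word whose vector is close to at least one seed word.

    The threshold is an assumed value, not one taken from the thesis.
    """
    known_seeds = [s for s in seeds if s in vectors]
    expanded = set(seeds)
    for word, vec in vectors.items():
        if word in expanded:
            continue
        if any(cosine(vec, vectors[s]) >= threshold for s in known_seeds):
            expanded.add(word)
    return expanded


def bigram_features(tokens, bigram_vocab):
    """Count each vocabulary bigram in the token sequence."""
    counts = Counter(zip(tokens, tokens[1:]))
    return [float(counts[bg]) for bg in bigram_vocab]


def lexicon_ratio(tokens, lexicon):
    """Stand-in for an emotional feature: share of tokens in the lexicon."""
    if not tokens:
        return [0.0]
    return [sum(t in lexicon for t in tokens) / len(tokens)]


def fuse(*feature_vectors):
    """Fuse feature vectors by concatenation (an assumed fusion operator)."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused


# Toy two-dimensional word vectors, invented for illustration only.
vectors = {
    "insult": [1.0, 0.0],
    "slur": [0.9, 0.1],
    "weather": [0.0, 1.0],
}
lexicon = expand_lexicon({"insult"}, vectors)

tokens = ["you", "are", "a", "slur"]
bigram_vocab = [("you", "are"), ("a", "slur")]
fused = fuse(bigram_features(tokens, bigram_vocab), lexicon_ratio(tokens, lexicon))
```

In this toy run, "slur" joins the lexicon because its vector lies close to the seed "insult", while "weather" does not. The fused vector would then be passed to a classifier.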
Keywords: social network, harmful speech user, harmful speech, feature extraction, neural network