Research On User Malicious Comments Detection Based On Machine Learning

Posted on:2020-07-01

Degree:Master

Type:Thesis

Country:China

Candidate:C Y He

Full Text:PDF

GTID:2428330572473619

Subject:Cryptography

Abstract/Summary:

With the popularity of the mobile Internet, people can express their opinions online anytime and anywhere. On the one hand, many media companies need users to comment actively; on the other hand, malicious comments are mixed in among the legitimate ones. These comments not only hurt others mentally but also make the whole Internet environment chaotic. More importantly, users who are attacked will gradually move to other products, which harms the company's development. Managers need to filter out malicious comments, but small companies cannot afford the cost of manual review. It is therefore necessary to design an automatic review scheme for malicious comments. To solve these problems, this thesis proposes a malicious comment review scheme based on machine learning. The results are as follows.

Firstly, based on research on "swear words and curses (li)" in Chinese linguistics, this thesis selects 40 seed words and then obtains a malicious dictionary by applying an expansion algorithm. Compared with manual dictionary selection, this method saves considerable labor cost. In addition, the dictionary can be used as a custom dictionary for Chinese word segmentation to improve segmentation accuracy.

Secondly, the thesis analyzes the news topics of each user's historical comments. An LDA model extracts the topics of the news content, and "user id", "user comment" and "news content of comment" are used as the input of an RNN model. Experiments show that the improved scheme improves the detection of malicious comments.

Finally, the results of the previous two parts are combined with the features used by traditional malicious comment systems. The thesis extracts 13 types of features from the dataset, calculates Pearson correlation coefficients, and analyzes the features. The features are then used as the input of a decision tree algorithm and an SVM algorithm respectively, and a detection model for malicious online comments is designed. Experimental results show that the decision tree algorithm achieves the best result, with a classification F1 value of 92.87%. The scheme designed in this thesis can therefore help users carry out comment detection.
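The abstract names two measures without defining them: the Pearson correlation coefficient used to analyze features, and the F1 value used to score the classifiers. The sketch below computes both with the Python standard library. It is not the thesis's code. The feature (`dictionary_hits`, a count of malicious-dictionary words per comment), the toy data, and the threshold rule standing in for a classifier are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: one feature value and one label per comment.
dictionary_hits = [0, 1, 0, 3, 2, 0, 4, 1]   # assumed feature, not from the thesis
labels = [0, 1, 0, 1, 1, 0, 1, 0]            # 1 = malicious, 0 = benign

# Stand-in "classifier": flag a comment when it has at least 2 dictionary hits.
predictions = [1 if hits >= 2 else 0 for hits in dictionary_hits]

print("Pearson r (feature vs. label):", pearson(dictionary_hits, labels))
print("F1 of threshold rule:", f1_score(labels, predictions))
```

In a setting like the one described, a feature whose correlation with the label is near zero would be a candidate for removal, and F1 would be computed on held-out data for each classifier being compared.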

Keywords/Search Tags:

Chinese linguistics, Malicious dictionary, LDA model, Machine learning

Related items

1	Research On Technology Of Malicious Users Identification Based On Weibo Content
2	Research On Malicious URL Detection Based On Machine Learning
3	Research And Application Of Uyghur-chinese Machine Translation Model Based On Deep Learning
4	Sentiment Analysis Of Chinese Reviews Based On Domain Dictionary And Machine Learning Methods
5	Design And Implementation Of JavaScript Malicious Code Detection Model Based On Machine Learning
6	Research On Malicious URL Detection Technology Based On Machine Learning
7	Research On Malicious URL Recognition Based On Machine Learning And Its System Implementation
8	Research On Malicious Web Page Recognition Based On Feature Fusion And Machine Learning
9	The Research On Chinese Microblog Sentiment Analysis Based On Rules And Machine Learning Methods
10	The Research On Writing Style Modeling Method Based On Supervised Learning