Research On Cyberbullying Detection In Social Media

Posted on:2021-02-28

Degree:Master

Type:Thesis

Country:China

Candidate:N J Lu

Full Text:PDF

GTID:2428330605450797

Subject:Information security

Abstract/Summary:

Cyberbullying in social media is often harmful, so detecting it effectively has both social and academic significance. Bullying features are difficult to learn because web text is user-generated content that contains spelling errors, grammatical errors, and other noise. As a result, cyberbullying detection remains a difficult, unsolved research problem. To improve detection accuracy, this thesis builds a neural network model that learns character combination features and semantic features and introduces shortcuts to fuse them, which effectively reduces the interference caused by noise in user-generated content. The main work and innovations of this thesis are as follows:

(1) To provide a Chinese dataset for cyberbullying detection tasks, this thesis builds an open-source Weibo dataset that lets a detection model learn Chinese cyberbullying features from real-world scenarios. The data are collected from Weibo users' comments on public figures who were involved in controversial events or received negative reviews, and are manually labeled so that the dataset is suitable for supervised text classification.

(2) To learn the features of cyberbullying, this thesis proposes the Char-CNNS model. To address noise in user-generated content, the model learns character combination features and semantic features, and introduces a shortcut strategy to fuse features from different levels of the neural network. Cyberbullying detection experiments on the Weibo and Twitter datasets show that the Char-CNNS model outperforms TF-IDF+SVM, N-gram+LR, and CNN in Precision, Recall, and F-measure.

(3) To reduce the interference of class imbalance on cyberbullying detection, this thesis adopts a cost-sensitive method and improves the robustness of the model with the Focal Loss function. The experimental results show that, compared with the Cross Entropy function, the Focal Loss function performs stably, and the F1 value increases by 2.7%, 2.9%, 4.5%, 3.4%, and 8.3%, respectively, on five datasets with different class ratios (positive to negative cases of 1:1, 1:2, 1:5, 1:10, and 1:20, respectively).
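The cost-sensitive idea in contribution (3) can be illustrated with a minimal sketch of a binary Focal Loss in plain Python. This is not the thesis's implementation: the function names, the per-example formulation, and the hyperparameter values (gamma = 2.0, optional alpha) are assumptions chosen for illustration, since the abstract does not state them. The sketch uses the standard definition FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), which reduces to Cross Entropy when gamma = 0 and no alpha weighting is applied.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=None, eps=1e-12):
    """Focal loss for a single example (illustrative sketch).

    p     -- predicted probability of the positive (bullying) class
    y     -- true label, 0 or 1
    gamma -- focusing parameter; assumed value, not taken from the thesis
    alpha -- optional weight for the positive class; assumed, not from the thesis
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)

    # Optional class weighting: alpha for positives, 1 - alpha for negatives.
    weight = 1.0
    if alpha is not None:
        weight = alpha if y == 1 else 1.0 - alpha

    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(p, y, eps=1e-12):
    """Binary cross entropy, expressed as focal loss with gamma = 0."""
    return binary_focal_loss(p, y, gamma=0.0, alpha=None, eps=eps)


if __name__ == "__main__":
    for p in (0.95, 0.30):
        ce = cross_entropy(p, 1)
        fl = binary_focal_loss(p, 1)
        print(f"p={p:.2f}  cross_entropy={ce:.4f}  focal_loss={fl:.4f}")
```

With the modulating factor, a confidently correct prediction contributes far less to the total loss than a misclassified one, so the many easy examples of the majority class dominate training less than they would under Cross Entropy. That is the general motivation for using this loss on imbalanced data.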

Keywords/Search Tags:

cyberbullying, user-generated content, text classification, convolutional neural network

Related items

1	Text Classification Tool Of Game Community Content Management
2	Research On Retrieval Methods In Social Networks Based On User-Generated Content
3	Research On The Organization Of Text User Generated Content Based On Linked Data
4	Research On News Text Classification Based On Convolutional Neural Network
5	Research On Text Classification Algorithm Based On Mixed Convolution
6	Research And Implementation Of Text Classification Method Based On Convolutional Neural Network And Topic Model
7	Research On Text Classification Based On Improved Convolutional Neural Network
8	Application Of Improved Convolutional Neural Network Models In Text Classification
9	User Generated Content Quality Evaluation Based On Text Analysis
10	Research And Implementation On Chinese Text Classification Algorithm Based On Convolutional Neural Network