Research On Network Undesirable Text Filtering Based On Social Platform

Posted on:2022-08-08

Degree:Master

Type:Thesis

Country:China

Candidate:Y J Wang

GTID:2518306347951339

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of Internet technology, social platforms are bringing people closer together. Because these platforms are virtual and immediate, all kinds of complicated and chaotic information spread freely on the Internet, and short text is the main form in which undesirable information is disseminated. Accurately and effectively filtering the undesirable texts circulating on social platforms is therefore socially meaningful research. Natural language processing offers a feasible solution, but text on social platforms often does not follow standard language rules and contains many word variants, mixed emotional language, and similar phenomena, which makes filtering undesirable network text a complex task. To address this difficulty, this paper proposes a filtering method for undesirable network text built on a hybrid deep learning framework, together with a label enhancement strategy based on reinforcement learning, so that the model can grasp the purpose of label classification more quickly and filter undesirable network text effectively. The main research work is as follows.

First, a method for filtering undesirable network text is proposed that fuses the pre-trained model BERT with a convolutional neural network. This paper treats the filtering of undesirable network language as a short text multi-class classification problem. Traditional short text classification requires several steps: feature selection, feature extraction, and classification. The features extracted by traditional machine learning methods are often poorly suited to the task, which leads to poor classification results. In response, this paper proposes BERT-CNN, a hybrid deep learning framework for undesirable text filtering that integrates the pre-trained model BERT with a Convolutional Neural Network (CNN). Experimental results show that the model effectively improves undesirable text classification performance on the test data set.

Second, a method for filtering undesirable network text based on reinforcement learning and label enhancement is proposed. Short text is limited by its length, so each document carries little semantic information. Because a numeric label serves only as a class index during classification, it also loses part of the semantic information, and the rich semantic information implicit in the label needs to be mined. This paper therefore proposes two label enhancement strategies: one extracts a subset of the input text as an extended label, and the other uses a generative model to generate labels randomly. Experimental results show that undesirable text classification based on reinforcement learning and label enhancement effectively reduces the classification error rate.

Based on the above research, this paper builds a filtering platform for undesirable Chinese network text on social platforms. Using the hybrid deep learning framework and the reinforcement learning method described above, the platform automatically classifies unlabeled undesirable texts, automatically identifies new undesirable words appearing on the Internet, and visualizes the classification results.
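
The abstract does not give the architecture details of BERT-CNN, so the following is only a minimal sketch of the general idea of placing a convolutional classifier on top of per-token vectors. Everything in it is an assumption made for illustration: random vectors stand in for BERT outputs, and the filter widths (2, 3, 4), the number of classes (3), and the names `conv1d_max_pool` and `classify` are hypothetical, not taken from the thesis.

```python
import math
import random


def conv1d_max_pool(embeddings, kernel, bias=0.0):
    """Slide one filter over the token axis, apply ReLU, then max-pool over time."""
    width = len(kernel)  # number of consecutive tokens the filter spans
    best = 0.0           # starting at 0 makes the running max equal to max(ReLU(...))
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        activation = bias + sum(
            w * x
            for kernel_row, token_vec in zip(kernel, window)
            for w, x in zip(kernel_row, token_vec)
        )
        best = max(best, activation)
    return best


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embeddings, kernels, out_weights):
    """One pooled feature per filter, then a linear layer and softmax over classes."""
    features = [conv1d_max_pool(embeddings, k) for k in kernels]
    logits = [sum(w * f for w, f in zip(row, features)) for row in out_weights]
    return softmax(logits)


rng = random.Random(0)
dim, seq_len, num_classes = 8, 12, 3

# Assumption: random vectors stand in for the per-token outputs of BERT.
embeddings = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(seq_len)]

# Assumption: one untrained filter for each of the widths 2, 3 and 4.
kernels = [
    [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(width)]
    for width in (2, 3, 4)
]
out_weights = [[rng.uniform(-1, 1) for _ in kernels] for _ in range(num_classes)]

probs = classify(embeddings, kernels, out_weights)
print(probs)  # one probability per class; the weights are untrained
```

In a trained system the filters and output weights would be learned jointly with (or on top of) the pre-trained encoder; here they are random, so the printed probabilities carry no meaning beyond showing the data flow.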

Keywords/Search Tags:

Short Text Classification, Network Undesirable Text Filtering, Deep Learning, Reinforcement Learning

Related items

1	Study Of Text Filtering Based On WEB Content Security
2	Research On Personalized Malicious Comments Filtering Algorithms Based On Reinforcement Learning
3	Research On Key Technologies Of Short Text Classification Based On Deep Learning
4	Research On Short Text Classification Based On Deep Learning
5	Research On Chinese Short Text Classification Based On Hybrid Neural Network
6	Research And Application Of Short Text Classification Algorithm Based On Deep Learning
7	Research On Text Classification Based On Deep Learning And Topic-driven
8	Research On Key Problems In Text Classification Research Based On Deep Learning
9	Research On Multilingual Short Text Classification Method Based On Deep Learning
10	Research On Multi-label Text Classification Algorithm Based On Deep Reinforcement Learning