Research On Harmful Information Recognition Based On A Bayesian Network Classification Algorithm

Posted on:2020-03-02

Degree:Master

Type:Thesis

Country:China

Candidate:P Ding

GTID:2428330599460192

Subject:Electronic Science and Technology

Abstract/Summary:

Recognizing harmful information in text form is essentially a text categorization problem. Spam filtering and network public opinion analysis can both be treated as binary classification of short texts. The sparseness of Chinese texts leads to high-dimensional features. In addition, the features of the Bayesian classification model are incomplete, and its conditional independence assumption does not hold. These shortcomings have become important factors restricting short text categorization. Considering the practical requirements of spam filtering and network public opinion analysis, this paper makes two improvements, one to feature extraction and one to the structure learning algorithm. First, a term frequency-inverse document frequency (TF-IDF) algorithm based on headword expansion is proposed. The algorithm takes the structural characteristics of the model into account to address the high dimensionality of the features; it increases feature diversity while reducing feature dimensionality. Second, the genetic algorithm and gray wolf optimization are combined into a hybrid gray wolf optimization-genetic algorithm, which is used to address feature incompleteness and the failure of the conditional independence assumption. A three-layer model is used to avoid feature incompleteness, and the gray wolf optimization-genetic algorithm relaxes the conditional independence assumption between the attributes of the classifier model. Finally, the two improved algorithms are applied to spam filtering and network public opinion analysis. Experimental analysis shows that the two improved algorithms and the three-layer Bayesian structure model are feasible, and that a classifier based on the proposed algorithms can improve classification performance. On this basis, harmful information recognition software based on a Bayesian network classifier is designed.
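
The abstract does not give the details of the headword-expansion TF-IDF variant or of the gray wolf optimization-genetic algorithm, so neither is reproduced here. As a point of reference only, the following minimal sketch shows the standard baseline those improvements start from: plain TF-IDF weighting and a naive Bayes classifier (which keeps the conditional independence assumption) on pre-tokenized short texts. The function and class names, the toy messages, and the "spam"/"ham" labels are all illustrative assumptions, not taken from the thesis.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (plain TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing (hypothetical baseline)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {term for doc in docs for term in doc}
        self.class_counts = Counter(labels)
        self.term_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(doc)
        self.n_docs = len(docs)
        return self

    def predict(self, doc):
        best_class, best_score = None, -math.inf
        for c in self.classes:
            total = sum(self.term_counts[c].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.class_counts[c] / self.n_docs)
            for term in doc:
                if term not in self.vocab:
                    continue  # ignore terms never seen in training
                smoothed = (self.term_counts[c][term] + 1) / (total + len(self.vocab))
                score += math.log(smoothed)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


# Toy, made-up training data for illustration only.
train_docs = [
    ["win", "free", "prize", "now"],
    ["free", "cash", "offer"],
    ["meeting", "schedule", "tomorrow"],
    ["project", "report", "meeting"],
]
train_labels = ["spam", "spam", "ham", "ham"]

weights = tf_idf(train_docs)
model = NaiveBayes().fit(train_docs, train_labels)
```

In this baseline every term is an independent feature, so vocabulary size sets the feature dimension, and the classifier cannot model dependencies between terms. Those are the two limitations the abstract says the proposed feature extraction and structure learning algorithms target.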

Keywords/Search Tags:

harmful information recognition, text categorization, Bayesian classifier, feature extraction, genetic algorithm

Related items

1	A Study On Chinese Text Categorization
2	Research On XML Text Categorization Based On Bayesian Classifier
3	Studies On Some Essential Problems In Automatic Text Categorization
4	The Research And Simulation On The Key Techniques Of Text Mining
5	Design And Realization Of Text Categorization System
6	A Study On Chinese Text Automatic Categorization
7	The Research And Implementation Of Automatic Text Categorization For Chinese Web Documents
8	An examination of KSS for feature selection for text categorization using support vector machines
9	Design And Implementation Of Text Classifier For Enterprise Technology Requirement
10	The Research Of Bayesian Classifier And Its Applications