
Bayesian Classification Algorithm Based On Attribute Discretization And Its Application

Posted on: 2018-05-11 | Degree: Master | Type: Thesis
Country: China | Candidate: N Li | Full Text: PDF
GTID: 2348330515998090 | Subject: Engineering
Abstract/Summary:
The naive Bayesian classification algorithm is regarded as one of the ten classic data mining algorithms because it is simple and efficient. However, it assumes that the attributes are independent of one another, and in practical applications this assumption usually does not hold. This thesis improves the classification accuracy of the algorithm through data preprocessing and by weakening the conditional independence assumption of naive Bayes. The main research work is as follows:

Discretization is a commonly used data preprocessing technique, but existing discretization algorithms often perform poorly on imbalanced datasets. This thesis proposes a new discretization algorithm, ICAIM, which improves the CAIM algorithm by combining the advantages of three different discretization criteria. ICAIM improves the quality of the discrete intervals, so that classification on the discretized dataset is better, especially for imbalanced datasets. The running time of ICAIM is also noticeably better than that of CAIM.

The thesis uses the CFS algorithm for attribute selection to obtain an optimal attribute subset, and proposes a new algorithm to address this problem.

Among the many methods for relaxing the conditional independence assumption of naive Bayes, attribute weighting has attracted considerable attention from researchers. To further reduce the negative impact of the assumption, this thesis assigns different weights to different attributes according to each attribute's contribution to the classification result. The weights account for both the dependencies among attributes and the dependencies between each attribute and the class attribute, which makes the weight obtained for each attribute more reasonable. Existing attribute weighting methods only incorporate the learned attribute weights into the naive Bayesian classification formula and do not incorporate them into the conditional probability estimates. This thesis uses a method called deep attribute weighting to improve the naive Bayesian classification model.

The ICAIM and SW-HNB algorithms proposed in this thesis are applied to a traditional Chinese medicine (TCM) auxiliary diagnosis and treatment system for coronary heart disease. Clinical data on coronary heart disease were discretized with the ICAIM algorithm, and the SW-HNB algorithm was then used to classify the patients' conditions. Experiments show that the system effectively supports diagnosis and treatment.
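
As an illustration of the existing style of attribute weighting described above, where the weights enter only the classification formula, the following minimal Python sketch scores each class c as P(c) times the product of P(a_i | c) raised to the weight w_i. It is not the thesis's SW-HNB or deep attribute weighting method. The function names, the toy data, the weight values, and the use of Laplace smoothing are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class and per-class attribute-value frequencies (discrete data)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = defaultdict(set)           # attribute index -> observed values
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, c)][v] += 1
            domains[i].add(v)
    return class_counts, value_counts, domains


def predict(model, weights, row):
    """Return argmax_c of log P(c) + sum_i w_i * log P(a_i | c)."""
    class_counts, value_counts, domains = model
    n = sum(class_counts.values())
    k = len(class_counts)
    best, best_score = None, -math.inf
    for c, nc in class_counts.items():
        # Laplace-smoothed prior (an assumption of this sketch).
        score = math.log((nc + 1) / (n + k))
        for i, v in enumerate(row):
            # Laplace-smoothed conditional probability P(a_i = v | c).
            p = (value_counts[(i, c)][v] + 1) / (nc + len(domains[i]))
            # The weight appears only as an exponent in the formula,
            # not inside the probability estimate itself.
            score += weights[i] * math.log(p)
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical toy data: attribute 0 is informative, attribute 1 is noise.
rows = [("high", "yes"), ("high", "no"), ("low", "yes"), ("low", "no")]
labels = ["sick", "sick", "well", "well"]
model = train(rows, labels)
weights = [1.0, 0.5]  # made-up weights; a real method would learn them
print(predict(model, weights, ("high", "no")))
```

Setting every weight to 1.0 recovers ordinary naive Bayes, and a weight of 0.0 removes an attribute from the decision entirely.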
Keywords/Search Tags:Data Mining, Discretization, Hidden Naive Bayes, Attribute Selection, Attribute Weighting