
Research On Bayesian Network-based Multi-dimensional Classification

Posted on: 2015-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: P Zhang
Full Text: PDF
GTID: 2308330464466766
Subject: Applied Mathematics
Abstract/Summary:
With the rapid development of information technology and the emergence of big data, multi-dimensional classification has become a hot research topic in data mining. A Bayesian network is a probabilistic graphical model that is widely used for reasoning under uncertainty and for data mining. In this thesis, we study multi-dimensional classification and build classification models based on Bayesian networks. Our main work can be outlined as follows.

First, we analyze the validity and feasibility of Bayesian chain classifiers for solving multi-dimensional classification problems, and we improve the initial chain model. We employ the K2 algorithm and the hill-climbing algorithm, respectively, to learn a general Bayesian network over all class variables that describes the dependencies among them. The chain classifier is then established within the framework of the resulting Bayesian network. To simplify the model, a fast and effective feature subset selection algorithm is adopted to remove possibly redundant and irrelevant features, and the retained feature variables are independent of each other. Consequently, we obtain simplified Bayesian chain classifiers.

Second, in multi-dimensional classification, the computational complexity of structure learning and inference grows super-exponentially as the number of variables increases. To address this, we introduce clustering methods into multi-dimensional classification and propose a clustering-based multi-dimensional classification model. The Visual Assessment of (Cluster) Tendency (VAT) algorithm is used to cluster the class variables, which yields several independent class clusters. We then learn a Bayesian network (chain) classifier for each cluster, and we adopt an improved CFS to select the feature subset for each class variable. The final model is constructed by concatenating all of the independent classifiers.

Finally, simulation experiments are carried out on three classical datasets. The results show that our models have simpler structures and good classification performance compared with existing methods.
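As an illustration only, the chain idea described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis's implementation: the class-variable DAG is written by hand here (the thesis learns it with K2 or hill climbing), the base classifier is assumed to be a categorical naive Bayes, feature selection is omitted, and all names (`NaiveBayes`, `BayesianChainClassifier`, `topological_order`, the toy data) are hypothetical.

```python
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (assumed base classifier)."""

    def fit(self, rows, labels):
        self.labels = sorted(set(labels))
        self.prior = Counter(labels)
        self.n = len(labels)
        self.counts = defaultdict(Counter)  # (feature index, label) -> value counts
        self.values = defaultdict(set)      # feature index -> values seen in training
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[(j, y)][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, row):
        def score(y):
            # Smoothed prior times smoothed per-feature likelihoods.
            p = (self.prior[y] + 1) / (self.n + len(self.labels))
            for j, v in enumerate(row):
                p *= (self.counts[(j, y)][v] + 1) / (self.prior[y] + len(self.values[j]))
            return p
        return max(self.labels, key=score)


def topological_order(parents):
    """Order class indices so that every class comes after its parents.

    `parents` maps each class index to a list of parent class indices (a DAG).
    """
    order, done = [], set()

    def visit(c):
        if c in done:
            return
        for p in parents[c]:
            visit(p)
        done.add(c)
        order.append(c)

    for c in sorted(parents):
        visit(c)
    return order


class BayesianChainClassifier:
    """One classifier per class variable, chained along a given class DAG."""

    def __init__(self, parents):
        self.parents = parents
        self.order = topological_order(parents)

    def fit(self, X, Y):
        self.models = {}
        for c in self.order:
            # Augment the features with the true values of the parent classes.
            rows = [tuple(x) + tuple(y[p] for p in self.parents[c])
                    for x, y in zip(X, Y)]
            self.models[c] = NaiveBayes().fit(rows, [y[c] for y in Y])
        return self

    def predict(self, x):
        pred = {}
        for c in self.order:
            # At prediction time, parent classes are replaced by their predictions.
            row = tuple(x) + tuple(pred[p] for p in self.parents[c])
            pred[c] = self.models[c].predict(row)
        return tuple(pred[c] for c in sorted(pred))


# Toy data (hypothetical): class 0 copies feature 0, class 1 is its complement.
X = [(0, 0), (0, 1), (1, 0), (1, 1)] * 3
Y = [(x[0], 1 - x[0]) for x in X]
clf = BayesianChainClassifier(parents={0: [], 1: [0]}).fit(X, Y)
print(clf.predict((1, 0)))
```

In the clustering-based variant, one such chain would be fitted per cluster of class variables and the per-cluster predictions concatenated; that step is not shown here.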
Keywords/Search Tags:Bayesian network chain classifiers, Feature selection, Multi-dimensional classification, Clustering