Research Of Ensemble Super Parent One-Dependence Estimator By Maximizing Conditional Likelihood

Posted on: 2016-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: H Xu
GTID: 2308330470967696
Subject: Computer Science and Technology
Abstract/Summary:
Despite its simplicity, Naive Bayes is a remarkably effective classifier. It relies on the strong assumption that the attributes are independent given the class label. However, this assumption is often violated by real-world data distributions. The ensemble of SuperParent one-dependence estimators (SPODEs) is one of the most effective improvements on Naive Bayes. It achieves high classification accuracy while decreasing variance. However, most existing approaches focus only on improving individual SPODEs during the selection and weighting procedures and overlook the importance of the ensemble model as a whole.

Based on the assumption that optimizing the performance of the entire ensemble classifier yields a better weight distribution than applying a greedy strategy inside each SPODE, we propose an ensemble SPODE algorithm that maximizes the conditional log likelihood (EODE-CLL). First, we choose the maximum conditional probability as the global optimization goal. The underlying assumption is that the best parameters are those under which the data are sampled from the whole model space with the highest probability. Compared with a least-squares error objective, this can help avoid over-fitting. Second, the algorithm assigns hierarchical weights to the SPODEs and to the attributes inside each SPODE. The second weight layer helps to fully optimize each local SPODE model. Finally, stochastic gradient descent is used to search for the best parameters. It scales well and has given rise to batch and distributed versions. Taken together, these steps form the proposed EODE-CLL algorithm, which can accommodate most existing ensemble models with different parameters.

We conduct experiments on 36 datasets from the UCI Machine Learning Repository, a public benchmark collection provided by the University of California, Irvine. The results show that EODE-CLL significantly outperforms state-of-the-art ensemble SPODE methods in terms of accuracy, F-measure, bias, and variance.
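The idea of learning ensemble weights by maximizing conditional log likelihood with stochastic gradient steps can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes that each SPODE's log joint estimates log P_m(y, x) are already computed, that the ensemble combines them log-linearly, and that only the SPODE-level weights are learned (the attribute-level weights inside each SPODE are omitted). All names here (`posterior`, `cll`, `sgd_fit`, `toy`) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Turn unnormalized log scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def posterior(weights, log_joint):
    """P(y | x) for one instance.

    log_joint[m][y] is the (assumed precomputed) log P_m(y, x) of SPODE m.
    Assumption: the ensemble score is the weighted sum of these log values.
    """
    n_classes = len(log_joint[0])
    scores = [
        sum(w * lj[y] for w, lj in zip(weights, log_joint))
        for y in range(n_classes)
    ]
    return softmax(scores)


def cll(weights, data):
    """Conditional log likelihood: sum of log P(y_true | x) over the data."""
    return sum(math.log(posterior(weights, lj)[y]) for lj, y in data)


def sgd_fit(data, n_models, lr=0.05, epochs=50, seed=0):
    """Stochastic gradient ascent on the CLL, one instance per update."""
    rng = random.Random(seed)
    weights = [1.0 / n_models] * n_models  # start from a uniform ensemble
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for log_joint, y in data:
            p = posterior(weights, log_joint)
            for m in range(n_models):
                # d/dw_m log P(y|x) = f_m(y) - E_{c ~ P(.|x)}[f_m(c)]
                expected = sum(p[c] * log_joint[m][c] for c in range(len(p)))
                weights[m] += lr * (log_joint[m][y] - expected)
    return weights


# Toy data (made up): two SPODEs, two classes. SPODE 0 favours the true
# class on every instance; SPODE 1 is uninformative or misleading.
L = math.log
toy = [
    ([[L(0.8), L(0.2)], [L(0.5), L(0.5)]], 0),
    ([[L(0.3), L(0.7)], [L(0.5), L(0.5)]], 1),
    ([[L(0.6), L(0.4)], [L(0.4), L(0.6)]], 0),
    ([[L(0.2), L(0.8)], [L(0.6), L(0.4)]], 1),
]
initial = [0.5, 0.5]
fitted = sgd_fit(toy, n_models=2)
print("CLL before:", cll(initial, toy), "after:", cll(fitted, toy))
print("weights:", fitted)
```

On this toy data the update raises the weight of the reliable SPODE and lowers the weight of the misleading one, so the conditional log likelihood of the fitted ensemble is higher than that of the uniform one. A batch variant would sum the per-instance gradients before each update.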
Keywords/Search Tags: Classification, Machine learning, Modeling Structured, Gradient methods, Conditional Likelihood