Improvement And Application Of Naive Bayes Algorithm Based On Attribute Selection Weighting

Posted on: 2018-07-21  Degree: Master  Type: Thesis
Country: China  Candidate: Z Bai  Full Text: PDF
GTID: 2348330533966288  Subject: Computer application technology
Abstract/Summary:
With the spread of information technology and the arrival of the big data era, the demand for in-depth data analysis keeps growing, and data mining is an effective tool for turning information into knowledge. The naive Bayes algorithm is one of the ten classic data mining algorithms selected by an authoritative international data mining conference. The naive Bayes model originates in classical probability theory, has a solid mathematical foundation, and offers stable classification efficiency. It also needs few estimated parameters, is not very sensitive to missing data, and is relatively simple. In theory, the naive Bayes model has the lowest error rate among classification algorithms. However, this result rests on the assumption that attributes are independent of each other, which often does not hold in practical applications. When the number of attributes is large or the attributes are strongly correlated, the performance of the model degrades.
This thesis addresses this deficiency of the naive Bayes algorithm and improves it in two respects: attribute selection and attribute weighting. For attribute selection, the information value is first introduced to obtain a first-round subset of attributes that are highly associated with the class; redundant attributes are then filtered out to obtain a second-round attribute subset, and a naive Bayes classification model is built on each of the two subsets. The analysis shows that the naive Bayes classification model built after two rounds of attribute selection on the initial attribute set not only reduces the dimensionality of the attributes but also improves classification accuracy.
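The first selection round can be illustrated with a minimal sketch, assuming the usual binary-class definition of information value, IV = Σ (p_i − q_i) · ln(p_i / q_i), where p_i and q_i are the shares of positive and negative samples that fall into category i of an attribute. The function names, the smoothing constant `eps`, and the cut-off `threshold=0.1` are illustrative assumptions, not values taken from the thesis; the second round (redundancy filtering) is not shown because the abstract does not specify its criterion.

```python
import math
from collections import Counter


def information_value(values, labels, positive=1, eps=0.5):
    """Information value of one categorical attribute against a binary label.

    eps is an additive smoothing count (an assumption of this sketch) that
    keeps every category share strictly positive so the logarithm is defined.
    """
    pos = Counter(v for v, y in zip(values, labels) if y == positive)
    neg = Counter(v for v, y in zip(values, labels) if y != positive)
    categories = set(pos) | set(neg)
    n_pos = sum(pos.values()) + eps * len(categories)
    n_neg = sum(neg.values()) + eps * len(categories)
    iv = 0.0
    for c in categories:
        p = (pos[c] + eps) / n_pos  # share of positives in category c
        q = (neg[c] + eps) / n_neg  # share of negatives in category c
        iv += (p - q) * math.log(p / q)
    return iv


def select_by_iv(columns, labels, threshold=0.1):
    """Keep attributes whose IV reaches the (hypothetical) threshold."""
    return [name for name, vals in columns.items()
            if information_value(vals, labels) >= threshold]
```

An attribute whose category distribution is identical in both classes gets an IV of zero and is dropped; the more the two distributions differ, the larger the IV.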
For attribute weighting, the Analytic Hierarchy Process is used to quantify empirical knowledge and adjust the weights obtained from the training samples, which yields more comprehensive weights. The posterior probability formula of the naive Bayes classifier is then weighted according to the importance of each attribute, which improves classification accuracy. Finally, combining the advantages of attribute selection and attribute weighting, a selective weighted naive Bayes algorithm is proposed: two rounds of attribute selection are applied to the initial attribute set using the information value index, the weights are computed with the Analytic Hierarchy Process, and a weighted naive Bayes classifier is built on the best subset. The algorithm is verified by experiments on general data sets. It is then applied to a model for identifying spam-message users in telecommunications, and experiments on the Spark platform demonstrate its effectiveness, with the aim of further improving the efficiency of spam information governance and optimizing its techniques.
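The weighting step can be sketched as follows. The abstract does not give the exact formula, so this sketch assumes the common exponent-weighted form P(c | x) ∝ P(c) · Π P(x_i | c)^{w_i}, evaluated in log space, and approximates the Analytic Hierarchy Process priorities with the row geometric-mean method rather than the principal eigenvector. The names `ahp_weights` and `WeightedNaiveBayes`, and the Laplace smoothing parameter `alpha`, are hypothetical.

```python
import math
from collections import Counter, defaultdict


def ahp_weights(pairwise):
    """Approximate AHP priority weights by the row geometric-mean method.

    pairwise[i][j] states how much more important attribute i is than
    attribute j; the matrix is assumed to be reciprocal.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


class WeightedNaiveBayes:
    """Categorical naive Bayes with one exponent weight per attribute."""

    def __init__(self, weights, alpha=1.0):
        self.weights = weights  # one weight per attribute
        self.alpha = alpha      # Laplace smoothing (assumption of this sketch)

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.n = len(labels)
        # counts[i][c][v]: how often attribute i takes value v within class c
        self.counts = [defaultdict(Counter) for _ in self.weights]
        self.values = [set() for _ in self.weights]
        for row, y in zip(rows, labels):
            for i, v in enumerate(row):
                self.counts[i][y][v] += 1
                self.values[i].add(v)
        return self

    def log_score(self, row, c):
        # log P(c) + sum_i w_i * log P(x_i | c)
        score = math.log(self.class_counts[c] / self.n)
        for i, v in enumerate(row):
            num = self.counts[i][c][v] + self.alpha
            den = self.class_counts[c] + self.alpha * len(self.values[i])
            score += self.weights[i] * math.log(num / den)
        return score

    def predict(self, row):
        return max(self.class_counts, key=lambda c: self.log_score(row, c))
```

With all weights equal to 1 this reduces to the ordinary naive Bayes classifier. Weights below 1 damp the influence of an attribute on the posterior, which is one way to soften the independence assumption for less important or partly redundant attributes.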
Keywords/Search Tags: data mining, naive Bayes, attribute selection, attribute weighting, information governance