Naive Bayes Classification And Application Based On Improved K-means Algorithm

Posted on: 2008-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2178360242960768
Subject: Computer application technology
Abstract/Summary:
Data mining, also called Knowledge Discovery in Databases (KDD), extracts knowledge or models of potential practical value from large databases or data warehouses. Classification is an important research branch of data mining, and classification based on the Bayesian approach is currently an active topic in the field. However, the Naive Bayes classifier presupposes conditional independence between attributes and complete data, and both requirements limit its use in practice. This thesis uses an improved K-means (IKM) algorithm to handle missing data and thereby improve the precision of the Naive Bayes classifier. The main work of the thesis is as follows:

(1) It proposes a Naive Bayes classification model based on the improved K-means (IKM) algorithm. IKM combines the advantages of hierarchical clustering and K-means while avoiding the shortcomings of each. Its essential principle is: first, run hierarchical clustering to obtain initial information, namely the number of clusters K and the initial cluster centers; second, refine this initial result with IKM; finally, obtain a high-quality clustering result.

(2) It designs a Naive Bayes classifier based on IKM (IKMNBC). First, IKM clusters the subset of the initial data that contains only complete records. Next, the similarity between every record in the subset with missing values and each of the K cluster centers is calculated, each such record is assigned to its nearest cluster, and its missing values are filled with the mean of the corresponding attribute in that cluster. Finally, the completed data set is classified with the Naive Bayes classifier. Experiments on UCI standard test data sets show that the approach is feasible and more accurate than ordinary Naive Bayes classification.

(3) Based on IKMNBC, the thesis also designs a system for assessing the teaching quality of faculty. The system performs well, is functionally comprehensive and easy to operate, and makes it convenient for users to assess the teaching results of faculty at higher vocational colleges.
Keywords/Search Tags: Data mining, Teaching quality assessment model, Naive Bayesian classification, Improved K-means algorithm
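
The clustering and missing-value steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and it rests on several assumptions the abstract does not state: similarity is taken to be Euclidean distance over the attributes a record actually has, the hierarchical stage uses centroid linkage, and the number of clusters `k` is passed in by hand rather than derived from the hierarchy. The function names (`hierarchical_seeds`, `kmeans`, `impute`) and the toy data are hypothetical. The final Naive Bayes step is omitted; the filled records would simply be handed to any Naive Bayes classifier.

```python
import math

def dist(a, b):
    """Euclidean distance over the attributes present (not None) in a (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b) if x is not None))

def mean(points):
    """Attribute-wise mean of a non-empty list of equal-length points."""
    return [sum(col) / len(points) for col in zip(*points)]

def hierarchical_seeds(points, k):
    """Merge the two closest clusters (centroid linkage) until k remain; return centroids."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: dist(mean(clusters[ij[0]]), mean(clusters[ij[1]])))
        clusters[i] += clusters.pop(j)  # j > i, so index i is unaffected by the pop
    return [mean(c) for c in clusters]

def kmeans(points, centers, max_iter=100):
    """Standard K-means (Lloyd) iterations starting from the given centers."""
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            groups[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [mean(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def impute(record, centers):
    """Fill None entries with the matching attribute of the nearest cluster center."""
    nearest = min(centers, key=lambda c: dist(record, c))
    return [c if x is None else x for x, c in zip(record, nearest)]

# Toy data (hypothetical): complete records form two well-separated groups.
complete = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2], [5.0, 5.0], [5.2, 4.8], [4.8, 5.2]]
centers = kmeans(complete, hierarchical_seeds(complete, k=2))

# A record with its second attribute missing is matched on the first attribute only.
filled = impute([5.1, None], centers)
```

Because each K-means center is the attribute-wise mean of its cluster, copying the missing attribute from the nearest center is the same as filling it with the cluster mean for that attribute, which is the rule the abstract describes.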