Naive Bayes Classification And Application Based On Improved K-means Algorithm

Posted on:2008-05-31

Degree:Master

Type:Thesis

Country:China

Candidate:Y Li

GTID:2178360242960768

Subject:Computer application technology

Abstract/Summary:

Data mining, also called Knowledge Discovery in Databases (KDD), extracts knowledge or models of potential practical value from large databases or data warehouses. Classification is an important research branch of data mining, and classification based on the Bayesian approach is currently an active research topic. However, the Naive Bayes classifier presupposes conditional independence among attributes and complete data, and these requirements limit its practical application. This thesis uses an improved K-means (IKM) algorithm to handle missing data and thereby improve the precision of the Naive Bayes classifier. The main work of the dissertation is as follows:

(1) It proposes a Naive Bayes classification model based on an improved K-means (IKM) algorithm. The improved algorithm combines the advantages of hierarchical clustering and K-means while avoiding the shortcomings of each. Its basic principle is as follows: first, hierarchical clustering is run to obtain initial information, namely the number of clusters K and the initial cluster centers; second, IKM refines this initial result; finally, a high-quality clustering result is obtained.

(2) It designs a Naive Bayes classifier based on IKM (IKMNBC). IKM is first used to cluster the complete-data subset of the initial data. For every record in the missing-data subset, the similarity between the record and each of the K cluster centers is calculated, the record is assigned to the nearest cluster, and its missing values are filled with that cluster's mean value for the corresponding attribute. Finally, the processed data set is classified with the Naive Bayes classifier. Experiments on standard UCI test data demonstrate the feasibility of the method and its higher accuracy compared with ordinary Naive Bayes classification.

(3) Based on IKMNBC, the dissertation also designs a system for assessing the teaching quality of faculty. The system offers sound performance, comprehensive functionality, and easy operation, making it convenient for users to assess the teaching results of faculty at senior vocational colleges.
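The clustering and missing-value filling steps described in (1) and (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the use of Euclidean distance as the similarity measure, centroid linkage for the hierarchical stage, and passing K in as a parameter (rather than deriving it from the hierarchy) are all assumptions made for illustration. Missing values are represented as `None`, and the final Naive Bayes classification step is omitted.

```python
import math

def dist(a, b, dims=None):
    """Euclidean distance over the given attribute indices (all by default)."""
    dims = range(len(a)) if dims is None else dims
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in dims))

def mean(points):
    """Attribute-wise mean of a non-empty list of equal-length records."""
    return [sum(col) / len(points) for col in zip(*points)]

def hierarchical_init(points, k):
    """Assumed initial stage: merge the two closest clusters
    (centroid linkage) until k clusters remain; return their centers."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        pairs = [(dist(mean(clusters[i]), mean(clusters[j])), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        merged = clusters.pop(j)      # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return [mean(c) for c in clusters]

def kmeans(points, centers, iters=100):
    """Standard K-means refinement starting from the given centers."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: dist(p, centers[c]))
            groups[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        new = [mean(g) if g else centers[c] for c, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return centers

def fill_missing(record, centers):
    """Pick the nearest center using only the observed attributes, then
    replace each None with that center's mean for the attribute."""
    observed = [i for i, v in enumerate(record) if v is not None]
    best = min(centers, key=lambda c: dist(record, c, observed))
    return [best[i] if v is None else v for i, v in enumerate(record)]

# Toy data (hypothetical): cluster the complete records, then fill one
# incomplete record from the nearest cluster center.
complete = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]]
centers = kmeans(complete, hierarchical_init(complete, k=2))
filled = fill_missing([1.1, None], centers)
```

In this toy run the incomplete record is close to the first group on its observed attribute, so its missing attribute is taken from that group's center. The filled records, together with the complete ones, would then be passed to a Naive Bayes classifier.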

Keywords/Search Tags:

Data-Mining, Teaching quality assessment model, Naive Bayesian classification, Improved K-means algorithm

Related items

1	Research Of Improved Mutual Information-Based Naive Bayesian Classification Model
2	Research On The Approach Of Classification In Data Mining Based On Naive Bayesian
3	Research On The Technology Of E-commerce Product Quality Risk Assessment Based On Data Mining
4	Design And Development Of Teaching Quality Assessment System Based On Data Mining
5	The Improvement Of Two Typical Classification Algorithms
6	Data Mining Applied Research In The Teaching Quality Evaluation
7	The Design And Implementation Of Fujian Normal University Teaching Quality Evaluation System Based On Data Mining
8	Algorithm Based On Improved Naive Bayesian For Predicting Weibo Behavior
9	Augmented naive Bayesian model of classification learning
10	Implementation Of News Classification System Based On The Naive Bayesian