A Novel Missing Data Imputation Method Based On K-means Algorithm And Association Rules

Posted on:2015-09-23

Degree:Master

Type:Thesis

Country:China

Candidate:C Wang

GTID:2348330518970633

Subject:Engineering

Abstract/Summary:

With the rapid development of technology, industries of all kinds now routinely use computers to manage information and have accumulated large volumes of data. In the course of extracting and analyzing data, some loss of data is hard to avoid, and so are its consequences. The main consequences are: key information is lost from the system; uncertain factors play a larger role than they should; and standard analysis methods become difficult or impossible to apply to the data set. In addition, missing values disrupt the analysis process, reduce the accuracy of the results, and can even make them unreliable. It is therefore essential to determine how to deal with missing data.

This paper presents a novel missing data imputation method based on the K-means algorithm and association rules. The method integrates the two algorithms to achieve better performance. K-means clustering increases the similarity of the data within each cluster, so that the association rule mining algorithm can discover more strong association rules. Association rule mining, in turn, addresses the low imputation accuracy of the K-means clustering algorithm. The combined approach handles the missing data imputation problem effectively and improves the accuracy of the imputed values.

This paper also analyzes the original K-means clustering algorithm and presents a new method for selecting K by calculating the distance gap between different data clusters. Based on the proposed method, it gives a reasonable value of K with experimental verification. Finally, it analyzes the original association rule mining algorithm and presents a new solution to two long-standing problems: the absence of suitable association rules and conflicts between association rules.
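The general idea of combining the two algorithms can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract gives no algorithmic details, so the function names (`kmeans`, `mine_rules`, `impute`), the single-antecedent rule form, the 0.6 confidence threshold, the evenly spaced seeding, the highest-confidence tie-break for conflicting rules, and the cluster-mode fallback when no rule applies are all assumptions made for illustration. Records are first grouped by K-means on numeric features, rules are then mined inside each cluster, and a missing value is filled from the best matching rule.

```python
from collections import Counter
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; seeds are evenly spaced input points (an assumption)."""
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def mine_rules(rows, target, min_conf=0.6):
    """Rules (attr=value) -> (target=value) with confidence >= min_conf.
    Every row passed in must have a known target value."""
    antecedent, joint = Counter(), Counter()
    for row in rows:
        for attr, val in row.items():
            if attr != target and val is not None:
                antecedent[(attr, val)] += 1
                joint[(attr, val, row[target])] += 1
    return {key: n / antecedent[key[:2]]
            for key, n in joint.items()
            if n / antecedent[key[:2]] >= min_conf}

def impute(records, features, target, k=2, min_conf=0.6):
    """Fill missing target values cluster by cluster; returns new records."""
    labels = kmeans(features, k)
    filled = [dict(r) for r in records]
    for c in range(k):
        idx = [i for i, l in enumerate(labels) if l == c]
        known = [records[i] for i in idx if records[i][target] is not None]
        if not known:
            continue  # nothing to learn from in this cluster
        rules = mine_rules(known, target, min_conf)
        mode = Counter(r[target] for r in known).most_common(1)[0][0]
        for i in idx:
            if records[i][target] is not None:
                continue
            # Rules whose antecedent matches this record.
            matches = [(conf, tval) for (attr, val, tval), conf in rules.items()
                       if records[i].get(attr) == val]
            # Conflicting rules: keep the highest confidence (hypothetical policy).
            # No matching rule: fall back to the cluster's most common value.
            filled[i][target] = max(matches)[1] if matches else mode
    return filled

# Toy data: two well-separated groups whose "type" value is identical,
# but whose "status" differs by group.
features = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
records = [
    {"type": "basic", "status": "no"},
    {"type": "basic", "status": "no"},
    {"type": "basic", "status": None},
    {"type": "basic", "status": "yes"},
    {"type": "basic", "status": "yes"},
    {"type": "basic", "status": None},
]
filled = impute(records, features, target="status", k=2)
```

On this toy data, mining over all complete rows at once yields no rule at the 0.6 threshold, because `type=basic` predicts each status with confidence 0.5. Inside each cluster the same antecedent has confidence 1.0, which illustrates the abstract's point that clustering first lets rule mining find stronger rules.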

Keywords/Search Tags:

missing data imputation, K-means clustering, Association Rules, Data preprocessing

Related items

1	Research On Data Cleaning Based On Clustering
2	Research Of Fuzzy Association Rules Algorithm Based On Data-driven FCM
3	Research On The Improvement Of Association Rules Mining Algorithm And Its Application
4	Research On The Optimization Of Association Rules
5	Research On Financial Loss Customer Mining Model Based On K-MEANS Clustering And Association Model
6	Association Rules Mining And Its Applications In Microarray Gene Expression Data
7	Research And Application Of Incomplete Data Imputation Algorithm Based On Subtractive Clustering
8	Colleges And Universities Teaching Course System Subsystem The Design And Implementation Of Data Mining
9	The Research Of Data Mining Association Rules Based On Web Service
10	Research Of Data Ming In Port Product Data Based On Clustering And Association Rules