Clustering Algorithm And Analysis Of Customer Loyalty

Posted on:2004-05-14

Degree:Master

Type:Thesis

Country:China

Candidate:B Zhang

GTID:2208360092498514

Subject:Computer software and theory

Abstract/Summary:

As information technology has developed, database applications have become ever more widespread, and conventional database queries can no longer cope with the resulting volumes of data. Data Mining has emerged in response: it is a new data analysis technology that helps decision-makers set service policy. Data Mining (Knowledge Discovery in Databases) means discovering knowledge and information in a data set that is implicit, useful, and previously unknown. Clustering analysis is an important part of the whole Data Mining system. Clustering is the process of grouping data into classes or clusters so that objects within the same cluster are highly similar to one another but very different from objects in other clusters. Dissimilarities are assessed based on the attribute values that describe the objects. Clustering is always carried out without prior knowledge, so the main task is to obtain a clustering result under this constraint.

Because of the importance and special role of cluster analysis in data management, research in this field has advanced greatly in recent years, and a number of clustering algorithms have been proposed. In general, the major clustering methods can be classified into the following categories: partitioning methods, hierarchical methods, density-based methods, grid-based methods, and model-based methods. In addition, some clustering algorithms integrate the ideas of several of these methods. Although all of these methods have achieved a great deal in different fields, they all run into difficulties when processing very large data sets. This thesis therefore analyzes the causes of these difficulties in cluster analysis and gives detailed solutions. The following problems are discussed:

1. The accuracy of the clustering algorithm. The accuracy of a clustering method refers to the accuracy with which the original data set is partitioned and its objects are assigned to clusters. Present clustering algorithms easily process data sets with regular partitioning characteristics, but their results on very large data sets are unsatisfactory. This thesis therefore discusses how to enhance the accuracy of clustering algorithms.

2. The high time and space complexity. Because the original data sets are very large and highly complex, data mining needs more and more time and memory to process them, and with limited resources the results are not accurate. Based on an analysis of the clustering algorithms, this thesis selects a low-complexity clustering algorithm for very large data sets.

3. Improvement of the hierarchical method. The hierarchical method is one of the clustering analysis methods for large data sets. With limited resources such as memory and CPU, it can obtain the best clustering result by using suitable algorithmic structures. However, slow convergence and poor results on random data sets hinder its use. Several algorithms for the hierarchical method are discussed in this thesis.

4. Visualization of clustering results. Information visualization is the precondition for human-computer interaction in data mining. As data sets grow, a hot topic is how to express high-dimensional data in a 2-dimensional space and provide the user with a compact and effective visualization interface. This thesis gives a detailed analysis and explanation of clustering result visualization.

This thesis consists of six chapters. Chapter one presents the background knowledge and illustrates the most important content in data mining. Chapter two describes the methods and criteria of clustering analysis and appraises the clustering algorithms. Chapter three gives an improvement of the hierarchical algorithm for very large data sets that enhances accuracy without adding algorithmic complexity. Chapter four discusses clustering result visualization and summarizes data visualization; in this chapter we give a method to describe high dimen...
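The definition of clustering in the abstract (objects in one cluster are similar, and dissimilarity is computed from attribute values) can be illustrated with a minimal Python sketch. It is not the algorithm developed in the thesis: it assumes Euclidean distance as the dissimilarity measure and plain single-linkage agglomerative merging, and the function name `single_linkage` and the sample points are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(points, k):
    """Merge the two closest clusters until only k clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per object
    while len(clusters) > k:
        # Dissimilarity between two clusters (an assumption of this sketch):
        # the smallest Euclidean distance between any pair of their members.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(
                dist(a, b) for a in clusters[ij[0]] for b in clusters[ij[1]]
            ),
        )
        clusters[i].extend(clusters[j])  # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters


# Hypothetical 2-attribute objects forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (8.0, 8.0), (8.1, 7.9)]
clusters = single_linkage(points, k=2)
```

Each merge step in this sketch compares every pair of clusters, which is exactly the kind of time cost that becomes prohibitive on very large data sets, the difficulty the abstract raises in problem 2.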

Keywords/Search Tags:

Data Mining, Clustering Algorithm, Clustering Feature, Cluster, Information Visualization, Parallel Coordinate
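The keyword Parallel Coordinate names one common way to draw high-dimensional data in a 2-dimensional space: each attribute gets its own vertical axis, and each object becomes a polyline across those axes. The Python sketch below shows only that coordinate mapping, under the assumption of per-attribute min-max scaling; the abstract is truncated before it describes the thesis's own method, so the function `to_polylines` and the sample rows are hypothetical.

```python
def to_polylines(rows):
    """Map each d-dimensional row to d vertices (axis_index, scaled_value)."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    polylines = []
    for row in rows:
        line = []
        for axis, value in enumerate(row):
            span = highs[axis] - lows[axis]
            # Scale to [0, 1]; a constant attribute is drawn at mid-height.
            y = (value - lows[axis]) / span if span else 0.5
            line.append((axis, y))
        polylines.append(line)
    return polylines


# Hypothetical objects with four attributes each.
rows = [
    (5.0, 200.0, 1.0, 30.0),
    (10.0, 100.0, 1.0, 60.0),
    (7.5, 150.0, 1.0, 45.0),
]
polylines = to_polylines(rows)
```

Drawing each polyline in a color chosen by its cluster label is one way such a plot can show how clusters separate on individual attributes.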

Related items

1	Visual data mining: Using parallel coordinate plots with K-means clustering and color to find correlations in a multidimensional dataset
2	Clustering Algorithm In Data Mining Research
3	An Association Rule Clustering Algorithm Based On K-means And Visualization
4	Algorithm Study On Clustering
5	Data Clustering And Visualization Technology
6	Research On Visualization Techniques And Application For Data Mining Based-on K-means
7	Research And Implementation Of Text Clustering Based On DK-Means
8	Research And Implementation Of Text Clustering Based On Dk-means
9	Research And Implementation Of Visualization For Cluster Process Based On Parallel Coordinates
10	The Research And Application Of Spectral Clustering Algorithm In Data Mining