Research On Clustering Algorithms Based On Rough Set

Posted on: 2014-03-22, Degree: Doctor, Type: Dissertation
Country: China, Candidate: Z Wang, Full Text: PDF
GTID: 1268330425965116, Subject: Computer application technology
Abstract/Summary:
Rough sets and clustering algorithms have been studied extensively in data mining for many years. As research in the field has deepened and attracted more scholars, it has become clear that many problems cannot be solved by a single method. Merging different theories to construct more complex data mining models for real-world data analysis is therefore a focus of current research.

With the rapid development of the economy, consumption data has become more complex, much larger in volume, and highly redundant. How to extract information useful for economic growth from such varied data is one research topic in data mining, and how to predict future spending from current information is another active one.

Clustering in particular has been studied extensively by many researchers. As that research has progressed, many problems have proved unsolvable with a single method, so constructing more complex data mining models to solve real data analysis problems is the research focus today.

Data mining, also known as Knowledge Discovery in Databases, is currently a central topic in artificial intelligence and database research. It is a decision-support process that draws mainly on artificial intelligence, machine learning, pattern recognition, statistics, databases, and visualization. It performs highly automated analysis of enterprise data, makes inductive inferences, and uncovers latent patterns that help decision-makers adjust marketing strategies, reduce risk, and make the right decisions. Data mining technology has its own rules, and it carries out its main data preparation by analyzing individual records within large volumes of data.
Data mining has many tasks, such as relationship analysis, cluster analysis, classification analysis, exception analysis, analysis of specific groups, and evolution analysis.

Clustering and prediction are central topics in data mining and play an important role in everyday life. Prediction has been studied for many years, and several algorithms have proved useful, such as naive Bayesian prediction, decision tree prediction, and support vector machine prediction. Clustering has also been studied for many years, with work concentrated mainly on distance-based clustering. Cluster analysis is an important human activity and has been widely used in many applications, including market research, pattern recognition, biological research, data analysis, spatial analysis, Web document analysis, and image processing.

This paper makes several contributions, as follows:

1. This paper presents an improved clustering algorithm that combines rough sets with fuzzy C-means. The algorithm borrows the idea of approximation sets from rough set theory and applies it to fuzzy clustering: each cluster is expressed as a lower approximation set and an upper approximation set, which addresses the problem of unclear cluster boundaries. Applied to the experimental data, the improved algorithm clusters better than the other algorithms.

2. This paper uses an improved algorithm based on rough sets and K-means clustering to study social networks, where it addresses the problem of vague community borders. Community detection in social networks is similar to cluster analysis in data mining. However, applying the K-means clustering algorithm to a social network has several disadvantages: how to determine the value of K, and how to describe the relationship between a node and its community. This paper addresses these disadvantages using rough set clustering ideas.
This method is mainly used to find overlapping communities, and it reflects social network information better and from multiple angles. Applied to experimental data, it clearly improves the accuracy of community division compared with other algorithms.

3. This paper presents a new prediction model that combines clustering with a neural network. Existing clustering algorithms share the shortcoming that they cannot determine the number of clusters K, so this paper improves them using rough set concepts and tests the result on the IRIS data. It then presents a new forecasting model that combines the improved clustering algorithm with a neural network to improve prediction accuracy, and tests it on statistical data.

4. This paper presents a new prediction model named the PCA-RKM-BP model. The model combines principal component analysis (PCA), K-means clustering based on rough sets (RKM), and the BP neural network. It uses the PCA output as the neural network's input and the rough set clustering result as the centers of the BP neural network's hidden-layer nodes. The model makes full use of the advantages of PCA, rough set clustering, and the BP neural network, and it effectively improves prediction accuracy. Applied to consumption data, the new model predicts better than both the linear prediction method and the traditional BP neural network.
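
The rough clustering idea in contributions 1 and 2, where each cluster is described by a lower and an upper approximation, can be sketched as follows. The abstract does not give the dissertation's actual algorithm, so this is only a minimal sketch of the rough k-means formulation commonly found in the literature, not the author's method. The function name `rough_kmeans`, the weight `w_lower`, the threshold `ratio`, and the toy points are all assumptions chosen for illustration.

```python
import math


def mean(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(coords) / n for coords in zip(*pts))


def rough_kmeans(points, centers, w_lower=0.7, ratio=1.3, iters=20):
    """Illustrative rough k-means (assumed formulation, not the thesis's).

    A point whose second-nearest center is almost as close as its nearest
    one (distance within `ratio` times the nearest distance) goes into the
    boundary of every such cluster; otherwise it goes into the lower
    approximation of its nearest cluster.
    """
    k = len(centers)
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        lower = [[] for _ in range(k)]
        boundary = [[] for _ in range(k)]
        for p in points:
            d = [math.dist(p, c) for c in centers]
            h = min(range(k), key=d.__getitem__)  # nearest center
            close = [j for j in range(k) if j != h and d[j] <= ratio * d[h]]
            if close:
                for j in [h] + close:  # ambiguous: shared by several clusters
                    boundary[j].append(p)
            else:
                lower[h].append(p)  # certain member of cluster h
        new_centers = []
        for j in range(k):
            if lower[j] and boundary[j]:
                lo, bd = mean(lower[j]), mean(boundary[j])
                # Weighted mix of certain members and boundary members.
                c = tuple(w_lower * a + (1 - w_lower) * b for a, b in zip(lo, bd))
            elif lower[j]:
                c = mean(lower[j])
            elif boundary[j]:
                c = mean(boundary[j])
            else:
                c = centers[j]  # empty cluster: keep the old center
            new_centers.append(c)
        if new_centers == centers:
            break
        centers = new_centers
    # Upper approximation = lower approximation plus boundary region.
    upper = [lower[j] + boundary[j] for j in range(k)]
    return centers, lower, upper


# Toy data (hypothetical): two tight groups and one point between them.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5.5, 5.5)]
centers, lower, upper = rough_kmeans(points, [(0, 0), (10, 10)])
```

In this toy run the point (5.5, 5.5) lies roughly between the two groups, so it ends up in both upper approximations and in neither lower approximation. That is the overlapping, unclear-boundary case the abstract describes for clusters and for social network communities.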
Keywords/Search Tags: Cluster analysis, Rough set, Data mining, Neural network, Prediction